Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Tue May 30, 2023 9:11 pm
by Lazvon
Same issue for me, though I caught it during downloading. I wish I'd remembered this thread; I would have done a packet capture then to see whether it was high packet loss or high latency (network related), or zero window sizes or delayed timestamped ACKs (server related).

I'll set up Wireshark with a capture filter for this 66.170.111.50 address, and hopefully next time I'm down here I'll have some data on where the issue is.

Either way, it's not my network, as you can see from the upload/download speeds I was getting from the same system at the same time.

Image

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Fri Jun 02, 2023 6:38 pm
by Lazvon
Okay, for me, over the two days I had the Wireshark packet captures going, the uploads were all great, 40-70Mbps. The downloads, however, were bad for the first 7, then fast for the last 3. The reason looks like out-of-order packets and/or packet loss. I don't know if the 66.170.111.50 server has two NICs, or two upstream paths that it was doing per-packet load balancing across, or just real packet loss, but you can see below what it looked like.

That loss/out-of-order delivery prevents TCP slow start from ever getting fast. I'm on a 2Gbps+ connection from this machine (to fast.com as seen above; actually 5Gbps to my personal colocated machines), and when there is no loss, it will build up to a 16MB TCP Receive Window for most of the transfer. The part of the packet capture I screen-capped below shows the first doubling, from the starting 256KB to 512KB, on the "good" 14Mbps download from the server (tcp.stream==19). In the bad example (tcp.stream==11) it just stayed at 256KB to the very end, and you can see the red bars of loss/OOO all the way through the transmission.
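
For a rough sense of why the window size matters: a single TCP stream can have at most one window of data in flight per round trip, so throughput is bounded by window / RTT. The sketch below is only an illustration. The 40 ms round-trip time is an assumed, hypothetical value (no RTT was measured in this thread), and the function name is made up for the example.

```python
def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


# ASSUMPTION: 40 ms round trip. Hypothetical value, not taken from the captures.
RTT_SECONDS = 0.040

# Window sizes mentioned in the post: the 256KB start, the first doubling, and 16MB.
windows = [
    ("256KB", 256 * 1024),
    ("512KB", 512 * 1024),
    ("16MB", 16 * 1024 * 1024),
]

for label, window in windows:
    bound = max_throughput_mbps(window, RTT_SECONDS)
    print(f"{label:>6} window: at most {bound:.1f} Mbps at {RTT_SECONDS * 1000:.0f} ms RTT")
```

This is only the receive-window ceiling. With loss or reordering the sender's congestion window also stays small, so real throughput can sit well below this bound.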

Either way, it looks like they fixed it... not sure if it is a temporary or permanent fix though. If I don't catch it happening again, I'm gonna chalk it up as "fixed".

Image

Image

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Fri Jun 02, 2023 9:18 pm
by Lazvon
Bah! Humbug!

As soon as I posted that, I noticed another of my folders stuck in the "Ready" state. I went to look at it and sure enough, it was trying to download from 66.170.111.50. I actually downloaded and installed Wireshark while the problem was still going on and captured packets: sure enough, packet loss and out-of-order packets, with TCP Receive Window sizes like 5KB and 20KB. It could barely transfer... slower than a modem.

I exited FAH Client, rebooted, and it grabbed a different WU and immediately started estimating 9M/day instead of the 20/day (yes, TWENTY) it had right before.

Oh looky, while I'm typing this, my machine gets quiet and sure enough, trying to download from 66.170.111.50 yet again. Argh.

This server needs to be taken offline until they figure out their problem. If the sysadmin wants some help, I'm happy to provide it.

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Fri Jun 02, 2023 9:25 pm
by Lazvon
Yeah, woo hoo... getting up to 7kbs. Blazing. :(

Image

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Fri Jun 02, 2023 9:39 pm
by Lazvon
And have to restart again to get a decent work unit.

Hmm. I think I'll just put in a firewall rule to deny connections to 66.170.111.50. I wonder how the FAH Client will handle that. It can't be worse than trying to download a 20MB file at 3Kbs for 2 hours at a time. :^(

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Fri Jun 02, 2023 9:40 pm
by Lazvon
Argh! Now another of my folders seems to have gotten stuck. Grrrr. Blocking on FW, see what happens...

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Fri Jun 02, 2023 11:45 pm
by Lazvon
Now 2 out of 7 folders sitting there "Ready" while downloading from this horrible server. *sigh*

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Sun Jun 04, 2023 2:54 pm
by Joe_H
This server has had its software upgraded. If you still see slow connections to it, please provide the IP address of your client either here or send it to me in a PM. I will pass that on to the developer.

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Sun Jun 04, 2023 3:35 pm
by Lazvon
Will remove my firewall rule and see if it comes back.

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Sun Jun 04, 2023 4:06 pm
by Lazvon
Already see one stuck on “Ready” on HFM.

Will get down to the basement as soon as I can. I am pretty sure I left Wireshark running with a capture filter for 66.170.111.50 on that one.

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Sun Jun 04, 2023 6:37 pm
by Lazvon
Yeah, I did get a capture of it happening. This time it was an upload rather than a download. Tons of Dup ACKs and Out-of-Order packets. Definitely network issues there.

Image

Here is a log, like everyone else's:

Code:

18:02:22:WU01:FS01:0x22:Completed 1237500 out of 1250000 steps (99%)
18:02:22:WU00:FS01:Connecting to assign1.foldingathome.org:80
18:02:22:WU00:FS01:Assigned to work server 140.163.4.210
18:02:22:WU00:FS01:Requesting new work unit for slot 01: gpu:1:0 AD103 [GeForce RTX 4080] from 140.163.4.210
18:02:22:WU00:FS01:Connecting to 140.163.4.210:8080
18:02:23:WU00:FS01:Downloading 9.15MiB
18:02:24:WU00:FS01:Download complete
18:02:24:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:17637 run:0 clone:79 gen:82 core:0x22 unit:0x0000004f00000052000044e500000000
18:03:31:WU01:FS01:0x22:Completed 1250000 out of 1250000 steps (100%)
18:03:31:WU01:FS01:0x22:Average performance: 30.9456 ns/day
18:03:45:WU01:FS01:0x22:Checkpoint completed at step 1250000
18:03:54:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
18:03:54:WU01:FS01:0x22:Saving result file checkpointIntegrator.xml
18:03:54:WU01:FS01:0x22:Saving result file checkpointState.xml
18:04:01:WU01:FS01:0x22:Saving result file positions.xtc
18:04:01:WU01:FS01:0x22:Saving result file science.log
18:04:01:WU01:FS01:0x22:Saving result file xtcAtoms.csv.bz2
18:04:01:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
18:04:02:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
18:04:02:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:16575 run:4 clone:141 gen:49 core:0x22 unit:0x310000008d00000004000000bf400000
18:04:02:WU01:FS01:Uploading 38.26MiB to 66.170.111.50
18:04:02:WU01:FS01:Connecting to 66.170.111.50:8080
18:04:02:WU00:FS01:Starting
18:04:02:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.20/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 12248 -checkpoint 15 -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor nvidia -gpu 0 -gpu-usage 100
18:04:02:WU00:FS01:Started FahCore on PID 2572
18:04:02:WU00:FS01:Core PID:5448
18:04:02:WU00:FS01:FahCore 0x22 started
18:04:03:WU00:FS01:0x22:*********************** Log Started 2023-06-04T18:04:02Z ***********************
18:04:03:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
18:04:03:WU00:FS01:0x22:       Core: Core22
18:04:03:WU00:FS01:0x22:       Type: 0x22
18:04:03:WU00:FS01:0x22:    Version: 0.0.20
18:04:03:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:04:03:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
18:04:03:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
18:04:03:WU00:FS01:0x22:       Date: Jan 20 2022
18:04:03:WU00:FS01:0x22:       Time: 01:15:36
18:04:03:WU00:FS01:0x22:   Revision: 3f211b8a4346514edbff34e3cb1c0e0ec951373c
18:04:03:WU00:FS01:0x22:     Branch: HEAD
18:04:03:WU00:FS01:0x22:   Compiler: Visual C++
18:04:03:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
18:04:03:WU00:FS01:0x22:             -DOPENMM_VERSION="\"7.7.0\""
18:04:03:WU00:FS01:0x22:   Platform: win32 10
18:04:03:WU00:FS01:0x22:       Bits: 64
18:04:03:WU00:FS01:0x22:       Mode: Release
18:04:03:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
18:04:03:WU00:FS01:0x22:             <peastman@stanford.edu>
18:04:03:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 2572 -checkpoint 15
18:04:03:WU00:FS01:0x22:             -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu-vendor
18:04:03:WU00:FS01:0x22:             nvidia -gpu 0 -gpu-usage 100
18:04:03:WU00:FS01:0x22:************************************ libFAH ************************************
18:04:03:WU00:FS01:0x22:       Date: Jan 20 2022
18:04:03:WU00:FS01:0x22:       Time: 01:14:17
18:04:03:WU00:FS01:0x22:   Revision: 9f4ad694e75c2350d4bb6b8b5b769ba27e483a2f
18:04:03:WU00:FS01:0x22:     Branch: HEAD
18:04:03:WU00:FS01:0x22:   Compiler: Visual C++
18:04:03:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
18:04:03:WU00:FS01:0x22:   Platform: win32 10
18:04:03:WU00:FS01:0x22:       Bits: 64
18:04:03:WU00:FS01:0x22:       Mode: Release
18:04:03:WU00:FS01:0x22:************************************ CBang *************************************
18:04:03:WU00:FS01:0x22:       Date: Jan 20 2022
18:04:03:WU00:FS01:0x22:       Time: 01:13:20
18:04:03:WU00:FS01:0x22:   Revision: ab023d155b446906d55b0f6c9a1eedeea04f7a1a
18:04:03:WU00:FS01:0x22:     Branch: HEAD
18:04:03:WU00:FS01:0x22:   Compiler: Visual C++
18:04:03:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
18:04:03:WU00:FS01:0x22:   Platform: win32 10
18:04:03:WU00:FS01:0x22:       Bits: 64
18:04:03:WU00:FS01:0x22:       Mode: Release
18:04:03:WU00:FS01:0x22:************************************ System ************************************
18:04:03:WU00:FS01:0x22:        CPU: 12th Gen Intel(R) Core(TM) i7-12700F
18:04:03:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 151 Stepping 2
18:04:03:WU00:FS01:0x22:       CPUs: 20
18:04:03:WU00:FS01:0x22:     Memory: 31.85GiB
18:04:03:WU00:FS01:0x22:Free Memory: 27.42GiB
18:04:03:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
18:04:03:WU00:FS01:0x22: OS Version: 6.2
18:04:03:WU00:FS01:0x22:Has Battery: false
18:04:03:WU00:FS01:0x22: On Battery: false
18:04:03:WU00:FS01:0x22: UTC Offset: -4
18:04:03:WU00:FS01:0x22:        PID: 5448
18:04:03:WU00:FS01:0x22:        CWD: C:\ProgramData\FAHClient\work
18:04:03:WU00:FS01:0x22:************************************ OpenMM ************************************
18:04:03:WU00:FS01:0x22:    Version: 7.7.0
18:04:03:WU00:FS01:0x22:********************************************************************************
18:04:03:WU00:FS01:0x22:Project: 17637 (Run 0, Clone 79, Gen 82)
18:04:03:WU00:FS01:0x22:Reading tar file core.xml
18:04:03:WU00:FS01:0x22:Reading tar file integrator.xml.bz2
18:04:03:WU00:FS01:0x22:Reading tar file state.xml.bz2
18:04:03:WU00:FS01:0x22:Reading tar file system.xml.bz2
18:04:03:WU00:FS01:0x22:Digital signatures verified
18:04:03:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
18:04:03:WU00:FS01:0x22:Version 0.0.20
18:04:03:WU00:FS01:0x22:  Checkpoint write interval: 125000 steps (5%) [20 total]
18:04:03:WU00:FS01:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
18:04:03:WU00:FS01:0x22:  XTC frame write interval: 250000 steps (10%) [10 total]
18:04:03:WU00:FS01:0x22:  Global context and integrator variables write interval: disabled
18:04:03:WU00:FS01:0x22:There are 4 platforms available.
18:04:03:WU00:FS01:0x22:Platform 0: Reference
18:04:03:WU00:FS01:0x22:Platform 1: CPU
18:04:03:WU00:FS01:0x22:Platform 2: OpenCL
18:04:03:WU00:FS01:0x22:  opencl-device 0 specified
18:04:03:WU00:FS01:0x22:Platform 3: CUDA
18:04:03:WU00:FS01:0x22:  cuda-device 0 specified
18:04:08:WU01:FS01:Upload 1.14%
18:04:10:WU00:FS01:0x22:Attempting to create CUDA context:
18:04:10:WU00:FS01:0x22:  Configuring platform CUDA
18:04:14:WU01:FS01:Upload 2.45%
18:04:14:WU00:FS01:0x22:  Using CUDA and gpu 0
18:04:14:WU00:FS01:0x22:Completed 0 out of 2500000 steps (0%)
18:04:15:WU00:FS01:0x22:Checkpoint completed at step 0
18:04:20:WU01:FS01:Upload 3.27%
18:04:26:WU01:FS01:Upload 4.25%
18:04:32:WU01:FS01:Upload 5.23%
18:04:38:WU01:FS01:Upload 6.04%
18:04:44:WU01:FS01:Upload 7.19%
18:04:46:WU00:FS01:0x22:Completed 25000 out of 2500000 steps (1%)
18:04:50:WU01:FS01:Upload 8.00%
18:04:56:WU01:FS01:Upload 9.15%
18:05:02:WU01:FS01:Upload 10.13%
18:05:08:WU01:FS01:Upload 11.27%
18:05:15:WU01:FS01:Upload 12.41%
18:05:17:WU00:FS01:0x22:Completed 50000 out of 2500000 steps (2%)
18:05:22:WU01:FS01:Upload 13.07%
18:05:28:WU01:FS01:Upload 14.37%
18:05:34:WU01:FS01:Upload 15.03%
18:05:40:WU01:FS01:Upload 16.01%
18:05:46:WU01:FS01:Upload 16.99%
18:05:48:WU00:FS01:0x22:Completed 75000 out of 2500000 steps (3%)
18:05:52:WU01:FS01:Upload 18.13%
18:05:59:WU01:FS01:Upload 19.11%
18:06:05:WU01:FS01:Upload 20.09%
18:06:11:WU01:FS01:Upload 21.07%
18:06:17:WU01:FS01:Upload 22.05%
18:06:19:WU00:FS01:0x22:Completed 100000 out of 2500000 steps (4%)
18:06:23:WU01:FS01:Upload 23.20%
18:06:29:WU01:FS01:Upload 24.01%
18:06:35:WU01:FS01:Upload 24.99%
18:06:41:WU01:FS01:Upload 25.97%
18:06:47:WU01:FS01:Upload 26.95%
18:06:50:WU00:FS01:0x22:Completed 125000 out of 2500000 steps (5%)
18:06:51:WU00:FS01:0x22:Checkpoint completed at step 125000
18:06:53:WU01:FS01:Upload 27.93%
18:07:00:WU01:FS01:Upload 28.91%
18:07:06:WU01:FS01:Upload 29.57%
18:07:12:WU01:FS01:Upload 30.38%
18:07:18:WU01:FS01:Upload 31.53%
18:07:22:WU00:FS01:0x22:Completed 150000 out of 2500000 steps (6%)
18:07:24:WU01:FS01:Upload 32.83%
18:07:30:WU01:FS01:Upload 33.81%
18:07:36:WU01:FS01:Upload 34.96%
18:07:42:WU01:FS01:Upload 35.94%
18:07:48:WU01:FS01:Upload 37.08%
18:07:53:WU00:FS01:0x22:Completed 175000 out of 2500000 steps (7%)
18:07:54:WU01:FS01:Upload 38.06%
18:08:00:WU01:FS01:Upload 39.04%
18:08:06:WU01:FS01:Upload 40.18%
18:08:12:WU01:FS01:Upload 41.49%
18:08:18:WU01:FS01:Upload 42.63%
18:08:24:WU01:FS01:Upload 43.94%
18:08:24:WU00:FS01:0x22:Completed 200000 out of 2500000 steps (8%)
18:08:30:WU01:FS01:Upload 45.25%
18:08:36:WU01:FS01:Upload 46.23%
18:08:42:WU01:FS01:Upload 47.86%
18:08:48:WU01:FS01:Upload 48.84%
18:08:54:WU01:FS01:Upload 49.82%
18:08:55:WU00:FS01:0x22:Completed 225000 out of 2500000 steps (9%)
18:09:01:WU01:FS01:Upload 50.48%
18:09:07:WU01:FS01:Upload 51.46%
18:09:13:WU01:FS01:Upload 52.60%
18:09:19:WU01:FS01:Upload 53.74%
18:09:25:WU01:FS01:Upload 54.89%
18:09:26:WU00:FS01:0x22:Completed 250000 out of 2500000 steps (10%)
18:09:27:WU00:FS01:0x22:Checkpoint completed at step 250000
18:09:31:WU01:FS01:Upload 56.03%
18:09:37:WU01:FS01:Upload 57.50%
18:09:43:WU01:FS01:Upload 59.30%
18:09:49:WU01:FS01:Upload 60.60%
18:09:55:WU01:FS01:Upload 61.42%
18:09:58:WU00:FS01:0x22:Completed 275000 out of 2500000 steps (11%)
18:10:01:WU01:FS01:Upload 62.24%
18:10:07:WU01:FS01:Upload 63.38%
18:10:13:WU01:FS01:Upload 64.52%
18:10:21:WU01:FS01:Upload 65.34%
18:10:27:WU01:FS01:Upload 66.16%
18:10:29:WU00:FS01:0x22:Completed 300000 out of 2500000 steps (12%)
18:10:33:WU01:FS01:Upload 67.46%
18:10:39:WU01:FS01:Upload 68.44%
18:10:45:WU01:FS01:Upload 69.59%
18:10:51:WU01:FS01:Upload 70.40%
18:10:57:WU01:FS01:Upload 71.22%
18:11:00:WU00:FS01:0x22:Completed 325000 out of 2500000 steps (13%)
18:11:03:WU01:FS01:Upload 72.04%
18:11:09:WU01:FS01:Upload 73.34%
18:11:15:WU01:FS01:Upload 74.49%
18:11:21:WU01:FS01:Upload 75.63%
18:11:27:WU01:FS01:Upload 76.28%
18:11:31:WU00:FS01:0x22:Completed 350000 out of 2500000 steps (14%)
18:11:33:WU01:FS01:Upload 77.26%
18:11:39:WU01:FS01:Upload 78.41%
18:11:48:WU01:FS01:Upload 79.55%
18:11:54:WU01:FS01:Upload 80.37%
18:12:00:WU01:FS01:Upload 81.35%
18:12:02:WU00:FS01:0x22:Completed 375000 out of 2500000 steps (15%)
18:12:03:WU00:FS01:0x22:Checkpoint completed at step 375000
18:12:06:WU01:FS01:Upload 82.33%
18:12:13:WU01:FS01:Upload 83.31%
18:12:19:WU01:FS01:Upload 83.96%
18:12:25:WU01:FS01:Upload 84.94%
18:12:31:WU01:FS01:Upload 85.76%
18:12:34:WU00:FS01:0x22:Completed 400000 out of 2500000 steps (16%)
18:12:39:WU01:FS01:Upload 86.74%
18:12:45:WU01:FS01:Upload 87.88%
18:12:51:WU01:FS01:Upload 88.86%
18:12:57:WU01:FS01:Upload 89.84%
18:13:03:WU01:FS01:Upload 90.99%
18:13:05:WU00:FS01:0x22:Completed 425000 out of 2500000 steps (17%)
18:13:09:WU01:FS01:Upload 92.13%
18:13:15:WU01:FS01:Upload 93.44%
18:13:21:WU01:FS01:Upload 94.58%
18:13:27:WU01:FS01:Upload 95.72%
18:13:34:WU01:FS01:Upload 97.03%
18:13:36:WU00:FS01:0x22:Completed 450000 out of 2500000 steps (18%)
18:13:40:WU01:FS01:Upload 98.17%
18:13:46:WU01:FS01:Upload 99.48%
18:13:52:WU01:FS01:Upload complete
18:13:52:WU01:FS01:Server responded WORK_ACK (400)
18:13:52:WU01:FS01:Final credit estimate, 1092647.00 points
18:13:52:WU01:FS01:Cleaning up
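
To put a number on that upload, the average rate can be worked out from the log's own timestamps. This is a minimal sketch, not part of FAHClient: the function name is made up, the regular expressions only cover the line shapes seen in the log above, and it assumes the transfer does not cross midnight. Only the three relevant lines are copied in.

```python
import re
from datetime import datetime

# Three lines copied from the log above (upload of WU01).
LOG = """\
18:04:02:WU01:FS01:Uploading 38.26MiB to 66.170.111.50
18:13:46:WU01:FS01:Upload 99.48%
18:13:52:WU01:FS01:Upload complete
"""


def average_upload_mbps(log_text: str) -> float:
    """Average rate in Mbit/s from 'Uploading <size>MiB' to 'Upload complete'."""
    size_mib = start = end = None
    for line in log_text.splitlines():
        match = re.match(r"(\d\d:\d\d:\d\d):WU\d+:FS\d+:(.*)", line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(2)
        size = re.match(r"Uploading ([\d.]+)MiB", message)
        if size:
            size_mib, start = float(size.group(1)), stamp
        elif message == "Upload complete":
            end = stamp
    if size_mib is None or start is None or end is None:
        raise ValueError("log does not contain a complete upload")
    seconds = (end - start).total_seconds()  # assumes no midnight rollover
    return size_mib * 1024 * 1024 * 8 / seconds / 1e6


print(f"average upload rate: {average_upload_mbps(LOG):.2f} Mbit/s")
```

For this log that is 38.26MiB in 590 seconds, a bit over half a megabit per second.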

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Sun Jun 04, 2023 6:47 pm
by Joe_H
Let me know your client's IP address and I will pass that on with a link to the screen shot of the report.

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Sun Jun 04, 2023 10:58 pm
by Lazvon
PM sent

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Mon Jun 05, 2023 3:33 am
by BobWilliams757
Much less delay for this one. I did have one take about 6 minutes to upload the other day, but that was before the mention of the software upgrade. Also, in my case I've never noticed download delays beyond very minor ones.

Code:

  
14:05:56:WU00:FS02:Connecting to 66.170.111.50:8080
14:05:58:WU01:FS02:0x22:Checkpoint completed at step 1250000
14:06:07:WU00:FS02:Downloading 42.69MiB

14:06:13:WU00:FS02:Download 6.44%
14:06:19:WU00:FS02:Download 13.91%

14:06:22:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:16706 run:51 clone:5 gen:112 core:0x22 unit:0x00000005000000700000414200000033
14:06:22:WU01:FS02:Uploading 65.24MiB to 66.170.111.50
14:06:22:WU01:FS02:Connecting to 66.170.111.50:8080
14:06:25:WU00:FS02:Download 21.23%
14:06:28:WU01:FS02:Upload 2.01%
14:06:31:WU00:FS02:Download 28.40%
14:06:34:WU01:FS02:Upload 6.71%
14:06:37:WU00:FS02:Download 35.72%
14:06:40:WU01:FS02:Upload 11.69%
14:06:43:WU00:FS02:Download 42.90%
14:06:46:WU01:FS02:Upload 16.57%
14:06:49:WU00:FS02:Download 50.07%
14:06:52:WU01:FS02:Upload 21.46%
14:06:55:WU00:FS02:Download 57.25%
14:06:58:WU01:FS02:Upload 26.35%
14:07:01:WU00:FS02:Download 64.42%
14:07:04:WU01:FS02:Upload 31.23%
14:07:07:WU00:FS02:Download 71.59%
14:07:10:WU01:FS02:Upload 36.12%
14:07:13:WU00:FS02:Download 78.92%
14:07:16:WU01:FS02:Upload 41.10%
14:07:19:WU00:FS02:Download 86.09%
14:07:22:WU01:FS02:Upload 45.98%
14:07:25:WU00:FS02:Download 93.26%
14:07:28:WU01:FS02:Upload 50.87%
14:07:30:WU00:FS02:Download complete
14:07:30:WU00:FS02:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:16576 run:2 clone:173 gen:56 core:0x22 unit:0x38000000ad00000002000000c0400000

14:07:40:WU01:FS02:Upload 60.64%
14:07:46:WU01:FS02:Upload 65.62%
14:07:52:WU01:FS02:Upload 70.61%
14:07:58:WU01:FS02:Upload 75.59%
14:08:04:WU01:FS02:Upload 80.57%
14:08:10:WU01:FS02:Upload 85.45%
14:08:16:WU01:FS02:Upload 90.34%
14:08:22:WU01:FS02:Upload 95.32%
14:08:30:WARNING:WU01:FS02:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
14:08:30:WU01:FS02:Trying to send results to collection server
14:08:30:WU01:FS02:Uploading 65.24MiB to 128.174.73.74
14:08:30:WU01:FS02:Connecting to 128.174.73.74:8080

14:08:36:WU01:FS02:Upload 13.32%
14:08:42:WU01:FS02:Upload 52.88%
14:08:48:WU01:FS02:Upload 97.43%
14:08:54:WU01:FS02:Upload complete
 

Re: Upload to 66.170.111.50:8080 failing repeatedly

Posted: Mon Jun 05, 2023 1:24 pm
by jcoffland
I've upgraded the Linux kernel on that server and installed the optimized Work Server code. This has improved the situation somewhat but I'm still seeing some slow connections.

I tried altering the TCP send/receive buffer sizes and played around with the TCP window size but none of that seemed to help. I've now captured a couple of hours of packet data on the server and am working on analysing it to try to discover the problem. The first major question is: does the problem lie in a) the WS software, b) the Linux kernel or c) the VMware network?
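
For readers unfamiliar with what changing the send/receive buffer sizes involves, here is a minimal per-socket sketch using the standard `SO_SNDBUF` and `SO_RCVBUF` options. It is an illustration only and an assumption about the general technique, not the Work Server's code or the settings actually tried on that machine; the helper name and the 4 MiB figure are made up for the example.

```python
import socket


def request_buffer_sizes(sock: socket.socket, send_bytes: int, recv_bytes: int):
    """Ask the OS for specific buffer sizes and return what it actually granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_bytes)
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_send, granted_recv


# Hypothetical request of 4 MiB in each direction on an unconnected TCP socket.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    granted = request_buffer_sizes(sock, 4 * 1024 * 1024, 4 * 1024 * 1024)
    print("granted send/receive buffers (bytes):", granted)
```

The granted values often differ from the request. Linux, for example, doubles the requested value for bookkeeping and caps it at the `net.core.wmem_max` / `net.core.rmem_max` limits, and setting `SO_RCVBUF` explicitly turns off receive-buffer autotuning for that socket.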

The server side packet capture also shows a lot of "TCP Dup ACK", "TCP Retransmission", and "TCP Out-Of-Order" packets. I've got a dump file with the payload removed and all the IP addresses randomized in case someone else would like to look at it. It's 600MiB compressed and contains 36M packets.
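
For anyone who wants to count those three packet types in a capture of their own, one possible approach is to let `tshark` apply Wireshark's TCP analysis display filters and count the matching frames. This is a sketch under assumptions: the capture file name is hypothetical, `tshark` must be installed, and each filter is a separate pass over the file, which would be slow on a 36M-packet dump.

```python
import os
import shutil
import subprocess

# Wireshark display filters for the three symptoms named above.
FILTERS = {
    "TCP Dup ACK": "tcp.analysis.duplicate_ack",
    "TCP Retransmission": "tcp.analysis.retransmission",
    "TCP Out-Of-Order": "tcp.analysis.out_of_order",
}


def tshark_command(pcap_path: str, display_filter: str) -> list:
    """Build a tshark call that prints one frame number per matching packet."""
    return ["tshark", "-r", pcap_path, "-Y", display_filter,
            "-T", "fields", "-e", "frame.number"]


def count_matches(pcap_path: str, display_filter: str) -> int:
    """Run tshark and count the frames that match the display filter."""
    result = subprocess.run(tshark_command(pcap_path, display_filter),
                            capture_output=True, text=True, check=True)
    return len(result.stdout.splitlines())


if __name__ == "__main__":
    pcap = "server_capture.pcapng"  # hypothetical file name
    if shutil.which("tshark") and os.path.exists(pcap):
        for label, display_filter in FILTERS.items():
            print(f"{label}: {count_matches(pcap, display_filter)} packets")
    else:
        print("tshark or the capture file is not available; nothing to count")
```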