Collection Server 140.163.4.200:8080

Posted: Sat Jul 24, 2021 8:41 pm
by Craig
Completed WUs sent to the server above seem to stop uploading on the last bits of the file. I was thinking it was only the larger files worth 300K points or more, but I just had one worth less than half that. I had been able to clear them up by rebooting and trying to get a different server. I have another PC running FAH that gets much larger files and I haven't had any problem with that one, but I'm not sure it has hit the server in question. The error I get is "20:08:56:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0". It seems to this old fart that there is a problem somewhere, and to me it "feels" like it is the server, but I really have no idea. If I have problems with a different server I'll update this post. Thanks to all of you and the others on your team for all the work they do! Let me know if there is anything I can do to help!

Code:

*********************** Log Started 2021-07-24T19:50:29Z ***********************
19:50:29:******************************* libFAH ********************************
19:50:29:           Date: Oct 20 2020
19:50:29:           Time: 13:36:55
19:50:29:       Revision: 5ca109d295a6245e2a2f590b3d0085ad5e567aeb
19:50:29:         Branch: master
19:50:29:       Compiler: Visual C++ 2015
19:50:29:        Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
19:50:29:       Platform: win32 10
19:50:29:           Bits: 32
19:50:29:           Mode: Release
19:50:29:****************************** FAHClient ******************************
19:50:29:        Version: 7.6.21
19:50:29:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
19:50:29:      Copyright: 2020 foldingathome.org
19:50:29:       Homepage: https://foldingathome.org/
19:50:29:           Date: Oct 20 2020
19:50:29:           Time: 13:41:04
19:50:29:       Revision: 6efbf0e138e22d3963e6a291f78dcb9c6422a278
19:50:29:         Branch: master
19:50:29:       Compiler: Visual C++ 2015
19:50:29:        Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
19:50:29:       Platform: win32 10
19:50:29:           Bits: 32
19:50:29:           Mode: Release
19:50:29:         Config: C:\ProgramData\FAHClient\config.xml
19:50:29:******************************** CBang ********************************
19:50:29:           Date: Oct 20 2020
19:50:29:           Time: 11:36:18
19:50:29:       Revision: 7e4ce85225d7eaeb775e87c31740181ca603de60
19:50:29:         Branch: master
19:50:29:       Compiler: Visual C++ 2015
19:50:29:        Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Zc:throwingNew /MT
19:50:29:       Platform: win32 10
19:50:29:           Bits: 32
19:50:29:           Mode: Release
19:50:29:******************************* System ********************************
19:50:29:            CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
19:50:29:         CPU ID: GenuineIntel Family 6 Model 42 Stepping 7
19:50:29:           CPUs: 4
19:50:29:         Memory: 15.98GiB
19:50:29:    Free Memory: 13.62GiB
19:50:29:        Threads: WINDOWS_THREADS
19:50:29:     OS Version: 6.1
19:50:29:    Has Battery: false
19:50:29:     On Battery: false
19:50:29:     UTC Offset: -4
19:50:29:            PID: 5604
19:50:29:            CWD: C:\ProgramData\FAHClient
19:50:29:  Win32 Service: false
19:50:29:             OS: Windows 7 Home Premium
19:50:29:        OS Arch: AMD64
19:50:29:           GPUs: 1
19:50:29:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 TU106 [GeForce RTX 2060 SUPER]
19:50:29:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
19:50:29:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:442.19
19:50:29:***********************************************************************
19:50:29:<config>
19:50:29:  <!-- Network -->
19:50:29:  <proxy v=':8080'/>
19:50:29:
19:50:29:  <!-- Slot Control -->
19:50:29:  <power v='full'/>
19:50:29:
19:50:29:  <!-- User Information -->
19:50:29:  <passkey v='*****'/>
19:50:29:  <team v='261242'/>
19:50:29:  <user v='Craig-WGHS-1971'/>
19:50:29:
19:50:29:  <!-- Folding Slots -->
19:50:29:  <slot id='0' type='CPU'>
19:50:29:    <paused v='true'/>
19:50:29:  </slot>
19:50:29:  <slot id='1' type='GPU'>
19:50:29:    <paused v='true'/>
19:50:29:    <pci-bus v='1'/>
19:50:29:    <pci-slot v='0'/>
19:50:29:  </slot>
19:50:29:</config>
19:50:29:Trying to access database...
19:50:29:Successfully acquired database lock
19:50:29:FS00:Initialized folding slot 00: cpu:3
19:50:29:FS01:Initialized folding slot 01: gpu:1:0 TU106 [GeForce RTX 2060 SUPER]
19:50:29:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
19:50:29:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
19:50:29:WU01:FS01:Connecting to 140.163.4.200:8080
19:51:02:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
19:51:02:WU01:FS01:Trying to send results to collection server
19:51:02:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
19:51:02:WU01:FS01:Connecting to 140.163.4.210:8080
19:51:33:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
19:51:33:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
19:51:33:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
19:51:33:WU01:FS01:Connecting to 140.163.4.200:8080
19:52:04:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
19:52:04:WU01:FS01:Trying to send results to collection server
19:52:04:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
19:52:04:WU01:FS01:Connecting to 140.163.4.210:8080
19:52:35:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
19:52:35:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
19:52:35:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
19:52:35:WU01:FS01:Connecting to 140.163.4.200:8080
19:53:06:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
19:53:06:WU01:FS01:Trying to send results to collection server
19:53:06:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
19:53:06:WU01:FS01:Connecting to 140.163.4.210:8080
19:53:37:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
19:54:12:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
19:54:12:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
19:54:12:WU01:FS01:Connecting to 140.163.4.200:8080
19:54:43:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
19:54:43:WU01:FS01:Trying to send results to collection server
19:54:43:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
19:54:43:WU01:FS01:Connecting to 140.163.4.210:8080
19:55:14:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
19:56:49:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
19:56:49:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
19:56:49:WU01:FS01:Connecting to 140.163.4.200:8080
19:57:20:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
19:57:20:WU01:FS01:Trying to send results to collection server
19:57:20:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
19:57:20:WU01:FS01:Connecting to 140.163.4.210:8080
19:57:51:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
20:01:03:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
20:01:03:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
20:01:03:WU01:FS01:Connecting to 140.163.4.200:8080
20:01:34:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
20:01:34:WU01:FS01:Trying to send results to collection server
20:01:34:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
20:01:34:WU01:FS01:Connecting to 140.163.4.210:8080
20:02:05:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
20:07:55:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e
20:07:55:WU01:FS01:Uploading 10.85MiB to 140.163.4.200
20:07:55:WU01:FS01:Connecting to 140.163.4.200:8080
20:08:26:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
20:08:26:WU01:FS01:Trying to send results to collection server
20:08:26:WU01:FS01:Uploading 10.85MiB to 140.163.4.210
20:08:26:WU01:FS01:Connecting to 140.163.4.210:8080
20:08:56:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
Regards,

Craig

Re: Collection Server 140.163.4.200:8080

Posted: Sat Jul 24, 2021 9:22 pm
by debs3759
When I get that error, the wu has usually uploaded (it has every time I checked) and full points assigned. The server just hasn't responded.

Re: Collection Server 140.163.4.200:8080

Posted: Sun Jul 25, 2021 12:25 am
by Craig
That's interesting. I haven't tried to check to see if it actually shows up if I don't reboot. I just see it sitting there sometimes for hours before I notice it. Do your WUs just sit there for a long time even after you've collected the points? Do they ever go away? I am fairly sure when I do the reboot it doesn't give the full points but I could be mistaken.

Thanks for your response!

Re: Collection Server 140.163.4.200:8080

Posted: Sun Jul 25, 2021 2:26 am
by debs3759
I usually check whether the wu has registered, then delete it once I have confirmed it, so I'm unsure whether the server eventually returns anything. It may not be so easy once I start folding on PCs with no monitor attached (it's too complicated for me to set up fahcontrol to read my work on other PCs).

Re: Collection Server 140.163.4.200:8080

Posted: Sun Jul 25, 2021 5:58 am
by aetch
Check your work unit yourself.
You just need to find the line in your logs which contains the relevant project/run/clone/gen data. It typically appears when you're sending results, have received a new work unit or are starting a work unit.
In your case it's this line - "19:50:29:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:17602 run:78 clone:8 gen:6 core:0x22 unit:0x0000000800000006000044c20000004e"

https://apps.foldingathome.org/wu

Caveat - sometimes the work/collection servers don't update the statistics servers in a timely manner so even this check can fail but if there's an entry for your work unit beside your folding donor name then you're golden.
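
For anyone who would rather not pick the numbers out of the log by eye, here is a minimal Python sketch that pulls the project/run/clone/gen values from a "Sending unit results" line so they can be entered at the link above. The function name `extract_prcg` is made up for illustration, and the pattern only assumes the field layout seen in the log earlier in this thread.

```python
import re

# Field layout as it appears in the FAHClient log lines posted in this thread.
PRCG_PATTERN = re.compile(r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)")

def extract_prcg(log_line):
    """Return (project, run, clone, gen) as ints, or None if the fields are absent."""
    match = PRCG_PATTERN.search(log_line)
    if match is None:
        return None
    return tuple(int(field) for field in match.groups())

line = ("19:50:29:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR "
        "project:17602 run:78 clone:8 gen:6 core:0x22 "
        "unit:0x0000000800000006000044c20000004e")
prcg = extract_prcg(line)
print(prcg)  # (17602, 78, 8, 6)
```

Lines without those fields (startup banner, config dump and so on) simply return None, so the function can be run over a whole log file.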

Re: Collection Server 140.163.4.200:8080

Posted: Sun Jul 25, 2021 10:10 am
by toTOW
You probably have Kaspersky or something like this ... disable any deep packet inspection feature (or something similar) for FAH.

If you have Discord, there are some messages there talking about it: https://discord.com/channels/5738706890 ... 8824171601

Re: Collection Server 140.163.4.200:8080

Posted: Sun Jul 25, 2021 2:38 pm
by Craig
aetch wrote:Check your work unit yourself.
OK, good information if I'm understanding it correctly!!!

I pasted that long string into the WU Status app you linked to and it shows it was credited at 2021-07-24 18:05:43. Was the WU actually credited at 18:05:43 and my app was still trying to upload at 20:08:56 because the data hadn't been updated on the Statistics server?

Were the hours of repeated resend attempts and short response errors happening only because the Statistics server that the Collection server consulted hadn't been updated, while everything else had worked as it was supposed to, meaning I'm re-uploading and rebooting for no reason? And when I happened to get another server and the WU re-uploaded completely with no error message, was that only because it had consulted a different Statistics server (or the original one had just been updated) that showed the WU was complete?

I hope you can make some sense out of this mess I just typed, sorry and a big thanks for your help!!!

Re: Collection Server 140.163.4.200:8080

Posted: Mon Jul 26, 2021 9:15 pm
by gunnarre
Craig wrote:Was the WU actually credited at 18:05:43 and my app was still trying to upload at 20:08:56 because the data hadn't been updated on the Statistics server?
No, the client doesn't check the statistics server. Something - either your antivirus, your router, your ISP, or a network in between - cut off the response from the server to your client, which means your client will try to upload the WU again until it reaches time of expiry. The statistics server does not enter into that transaction.

Whitelisting the client in the antivirus might be the solution, or if it works when you tether the computer via the cellphone network or connect via VPN, then the problem lies with your ISP.

A missing response to the client, and a missing update to the statistics server are two distinct problems which sometimes occur and they are not related - one might work fine without the other working.
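
The repeated upload attempts are visible in the log from the first post. As a rough Python sketch, the timestamps of its "Sending unit results" lines can be turned into the wait between attempts. The helper name `gaps_in_seconds` is made up for illustration, and the numbers only describe that one log, not how the client schedules retries in general.

```python
from datetime import datetime

# Timestamps of the "Sending unit results" lines in the log from the first post.
attempt_times = ["19:50:29", "19:51:33", "19:52:35", "19:54:12",
                 "19:56:49", "20:01:03", "20:07:55"]

def gaps_in_seconds(times):
    """Seconds between consecutive HH:MM:SS stamps (assumes no midnight rollover)."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(parsed, parsed[1:])]

gaps = gaps_in_seconds(attempt_times)
print(gaps)  # [64, 62, 97, 157, 254, 412]
```

In that log the wait grows from about a minute to almost seven minutes, so a WU stuck in this state is retried less and less often rather than continuously.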

Re: Collection Server 140.163.4.200:8080

Posted: Tue Jul 27, 2021 1:24 am
by debs3759
gunnarre wrote:
Craig wrote:Was the WU actually credited at 18:05:43 and my app was still trying to upload at 20:08:56 because the data hadn't been updated on the Statistics server?
No, the client doesn't check the statistics server. Something - either your antivirus, your router, your ISP, or a network in between - cut off the response from the server to your client, which means your client will try to upload the WU again until it reaches time of expiry. The statistics server does not enter into that transaction.

Whitelisting the client in the antivirus might be the solution, or if it works when you tether the computer via the cellphone network or connect via VPN, then the problem lies with your ISP.

A missing response to the client, and a missing update to the statistics server are two distinct problems which sometimes occur and they are not related - one might work fine without the other working.
I get the same error sometimes, but as the server eventually replies, it seems to me that the problem is at the server end, possibly because it is overloaded and replies too late. If anything was cutting it off, the problem would be there every time I upload to the same server, and would never be resolved. Also, the upload would likely never start if the server didn't let the client know it was ready. How would the client know to send more packets if no response ever got through?

Re: Collection Server 140.163.4.200:8080

Posted: Tue Jul 27, 2021 4:30 am
by Neil-B
@debs3759 Whilst not the only thing that can cause this effect it has been shown in the past to be linked to various av (two specific ones spring to mind) when they inspect traffic (there is a setting linked to this) .. the way some av/firewalls work means it is not as simple/clear/crude as always blocking everything to a specific server - it is possible for the effect to be intermittent - sometimes project related, sometimes wu related, sometimes even network latency related .. if the cause were fully understood then a fix might happen but in the meantime the short response received error on an upload that actually works does sometimes appear to be av related and is worth checking if this problem occurs repeatedly (even if sporadically without obvious pattern)

My personal guess (and I mean guess) is that it is most likely a latency related issue (either av/firewall/isp/server response time) linked to a timeout in the software which may have been set too short when issues with servers meant they didn't send out responses and so the software needed a timeout - an unintended consequence of solving one obvious issue creating another much subtler one ... but as I said this is a wild guess and speculation on my part purely based on vague patterns I have observed

Re: Collection Server 140.163.4.200:8080

Posted: Tue Jul 27, 2021 8:03 pm
by aetch
Every internet security program should have an exclusions list; FAH recommend adding the FAH working directory to this list. This is to ensure the internet security program does not mistake the work units for malware.

Scroll down to "Note" at the end of the page.
Windows Requirements
Mac Requirements

They don't appear to make the same recommendation for Linux.

Re: Collection Server 140.163.4.200:8080

Posted: Tue Jul 27, 2021 8:40 pm
by Neil-B
The short response issue I have noticed isn't a folder exclusion issue .. it is related to traffic inspection settings

Re: Collection Server 140.163.4.200:8080

Posted: Tue Jul 27, 2021 9:19 pm
by aetch
I usually read the short response as a server overload issue.

The folder/file exclusion - I could be wrong but I think it's not just an exclusion of the contents of the folders but also of the traffic going to and coming from those folders.

Re: Collection Server 140.163.4.200:8080

Posted: Tue Jul 27, 2021 9:39 pm
by Neil-B
With BD and Ks people have reported that even with av/firewall "off" the inspection feature can cause issues .. and yup sometimes the short response may be server side but not always from what I have seen .. the message appears to occur whenever a response is not received within a set timeout .. a non responsive server may certainly cause this, as might isp related interference or av/firewall issues .. anything that delays or possibly corrupts/blocks seems to come into the same error .. if it is server side everybody reports/complains about one server .. if just a few people having issues then the other options seem to come into play .. cannot say for sure but observation over time has led me to this hypothesis
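
One way to look at the timeout idea is to measure how long each attempt in the log from the first post lasted before the short response error appeared. The Python sketch below does that for three attempts copied from that log. The function name `seconds_to_failure` is made up for illustration, and a near-constant duration is only consistent with a fixed timeout; it does not show which side is slow.

```python
from datetime import datetime

# Three connect/fail pairs copied from the log in the first post.
LOG_EXCERPT = """\
19:50:29:WU01:FS01:Connecting to 140.163.4.200:8080
19:51:02:WARNING:WU01:FS01:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
19:51:02:WU01:FS01:Connecting to 140.163.4.210:8080
19:51:33:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
20:08:26:WU01:FS01:Connecting to 140.163.4.210:8080
20:08:56:ERROR:WU01:FS01:Exception: 10002: Received short response, expected 512 bytes, got 0
"""

def seconds_to_failure(log_text):
    """Pair each 'Connecting to' line with the next 'Exception:' line."""
    results = []
    pending = None  # (server, connect time) of the attempt in progress
    for line in log_text.splitlines():
        stamp = datetime.strptime(line[:8], "%H:%M:%S")
        if "Connecting to " in line:
            pending = (line.split("Connecting to ", 1)[1], stamp)
        elif "Exception:" in line and pending is not None:
            server, started = pending
            results.append((server, int((stamp - started).total_seconds())))
            pending = None
    return results

durations = seconds_to_failure(LOG_EXCERPT)
for server, seconds in durations:
    print(server, seconds)
```

For these three attempts the error arrives 33, 31 and 30 seconds after connecting, and it happens on both 140.163.4.200 and 140.163.4.210.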

Re: Collection Server 140.163.4.200:8080

Posted: Thu Jul 29, 2021 12:19 am
by Craig
Here is a bit of extra information and a link provided by aetch:
https://apps.foldingathome.org/wu

Apparently, my work units are being completed correctly the first time they are uploaded!!! This morning I had one that had kept uploading for over 7 hours, but the link above gave me a response that basically said, "Hi Craig-WGHS-1971 (team 261242), your WU was added to the stats database." It had been credited the first time it was uploaded and gave me the expected number of points.

As I've said before, I'm an old fart and I don't have a clue how any of this stuff works, but since some WUs upload without any problems whatsoever and some seem like they will just keep going forever, it would seem to mean my virus protection or my firewall is complete crap. Granted, this is an old PC I built running Windows 7 and it just has a 2060 Super GPU, but the PC I built a few months ago, for which I got lucky and got a 3090 GPU and which is running Windows 10 (which I hate with every fiber of my being LOL), also gets this error, but only very occasionally.

I haven't actually confirmed this by checking points and WUs, as I've got two PCs running FAH, both using the same User Name and Team. Trying to match two PCs' log files to confirm points or WUs is too much work. But if there is anyone who actually knows this stuff, is in a position to work with me, and has some input to FAH to get them involved as well, I'd be willing to move this PC to another Team or Name or whatever is necessary to help see if the root cause of this can be discovered and corrected, because this problem seems to affect more than just me and annoys the fuck out of me! Thanks guys!