esp8266: having trouble with the net.connect command hanging

Discussion of issues related to writing ZBasic applications for targets other than ZX devices, i.e. generic targets.
JimG
Posts: 17
Joined: 24 March 2017, 6:15 AM

Post by JimG »

Occasionally, maybe one time in four, the Net.Connect command never returns to my program. The timeout doesn't trigger, and I can find no way for the program to regain control when it happens.

Here is a sample debug output showing that the status of everything is the same on each attempt; I can find no reason for the stoppage:

Code:

count=76, state=2, WiFiStatus=5 handle=0, NetStatus(handle=0)=0x0003=valid,open
in ConnectToHost, handle=0, NetStatus(handle=0)=0x0003=valid,open, host=216.229.0.179
 trying net.connect command with
handle=0, host=3003180504=216.229.0.179, port=13, protocol=0=TCP
in netCallback(), handle=0 cb: connect
Net.Connect() return=0, NetStatus(handle=0)=0x000f=valid,open,connected,espclient
the remote host IP=216.229.0.179 port=13 protocol=TCP


count=86, state=2, WiFiStatus=5 handle=0, NetStatus(handle=0)=0x0003=valid,open
in ConnectToHost, handle=0, NetStatus(handle=0)=0x0003=valid,open, host=216.229.0.179
 trying net.connect command with
handle=0, host=3003180504=216.229.0.179, port=13, protocol=0=TCP
in netCallback(), handle=0 cb: connect
Net.Connect() return=0, NetStatus(handle=0)=0x000f=valid,open,connected,espclient
the remote host IP=216.229.0.179 port=13 protocol=TCP


count=107, state=2, WiFiStatus=5 handle=0, NetStatus(handle=0)=0x0003=valid,open
in ConnectToHost, handle=0, NetStatus(handle=0)=0x0003=valid,open, host=216.229.0.179
 trying net.connect command with
handle=0, host=3003180504=216.229.0.179, port=13, protocol=0=TCP
As you can see, it worked twice and stopped on the third attempt. Occasionally it reaches the netCallback routine with a disconnect code, but not often.
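
Editor's note: the log prints the host both as the integer 3003180504 and as 216.229.0.179. The two values agree if the first octet is stored in the lowest byte of the 32-bit value. That byte order is inferred only from these log lines, not from ZBasic documentation, and the helper names below are hypothetical. A minimal Python sketch for checking a logged address offline:

```python
def u32_to_dotted_low_byte_first(value):
    """Split a 32-bit value into octets, lowest byte first (assumed order)."""
    octets = [(value >> shift) & 0xFF for shift in (0, 8, 16, 24)]
    return ".".join(str(octet) for octet in octets)


def dotted_to_u32_low_byte_first(text):
    """Pack a dotted-quad string with the first octet in the lowest byte."""
    a, b, c, d = (int(part) for part in text.split("."))
    return a | (b << 8) | (c << 16) | (d << 24)


# Value taken from the debug output above.
print(u32_to_dotted_low_byte_first(3003180504))
```

This can help confirm that the address passed to Net.Connect is the one intended when a particular server in the pool misbehaves.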

I'll attach the program I'm using, but it's a real mess with all the debugging stuff I've added. It started out as the example ClientTCP program.
dkinzer
Site Admin
Posts: 3120
Joined: 03 September 2005, 13:53 PM
Location: Portland, OR

Post by dkinzer »

JimG wrote: occasionally [...] the net.connect command never returns to my program.
This is one of the worst types of problems to attempt to diagnose and I don't have much in the way of advice to tackle it. There have been quite a few reports of "flakiness" in the ESP8266 platform (among those who aren't using ZBasic) and most of the advice that I've seen suggests that it could be a power supply issue (marginal or noisy).
- Don Kinzer
JimG

Post by JimG »

Thank you for your ideas. I tried several things including using two D-cells as a power source, which should have many times the current required and zero noise. Still getting the same problems.

I'm keeping stats on results from the various servers one gets from the time.nist.gov pool, and most of the problems are coming from a few of the servers. I may have to exclude some of them, but that seems like a way too kludgy solution.

Another thing I don't understand is why the watchdog timer isn't resetting the chip after it's locked up and unresponsive.

I hope there will be some ZBasic support for the ESP32. It seems like it should solve a lot of problems.
JimG

Post by JimG »

Does the file ZBasic\xml\ZBasic.xml have an effect on anything?

I was trying to find out what could possibly be wrong with the Net.Connect command, since the program occasionally comes to a halt when issuing the command and it is not related to anything in my control electronically.

The file contains the line:
<BuiltIn>Net.Connect(ByVal chan as AnyIntegralByVal ipAddr as UnsignedLong, ByVal port as AnyIntegral, ByVal proto as AnyIntegral)</BuiltIn>

which is missing a comma.

------------------------------------------------------------------------------

Also, I'm trying to use the command

call Net.SetTimeout(handle,10)

to set a maximum time for the Net.Connect command before giving up, but I get this error:

Error: one or more errors occurred in the back-end build process for "ClientTCP.esp"
>Exit code: 1

when I try to compile.
dkinzer

Post by dkinzer »

JimG wrote: Does the file ZBasic\xml\ZBasic.xml have an effect on anything?
That file is used by the IDE for providing parameter information when a call statement is being typed. You can edit it to add the missing comma.
JimG wrote: Error: one or more errors occurred in the back-end build process for "ClientTCP.esp"
If you add the line below to your .pjt file (near the top) more information will be emitted including the actual error messages from the back-end compilation and linking process. That information will be useful to diagnose the problem.

Code:

--verbose
- Don Kinzer
JimG

Post by JimG »

Here's what came out:

Code:

>"F:\ZBasic\zbasic.exe"  --target-device=ESP8266 --directory="F:\ZBasic\progs\testtcp/" --project="ClientTCP.pjt"
make: Entering directory `F:/ZBasic/progs/testtcp/zb_hHW8Nu'
xtensa-lx106-elf-gcc -c -DZBASIC_APP -DZB_MOD_NAME=ClientTCP -I"F:/ZBasic/zlib" -I. -IF:/ZBasic/xtensa-lx106-elf/include -IF:/ZBasic/xtensa-lx106-elf/ESP8266_SDK/include -D__ets__ -DICACHE_FLASH -gdwarf-2 -O2 -Wpointer-arith -Wundef -Werror -Wno-unused-variable -Wno-unused-but-set-variable -fno-strict-aliasing -falign-functions=4 -mlongcalls -mtext-section-literals ClientTCP.c -o ClientTCP.o.tmp
xtensa-lx106-elf-objcopy --rename-section .text=.irom0.text --rename-section .literal=.irom0.literal ClientTCP.o.tmp ClientTCP.o
rm -f ClientTCP.o.tmp
rm -f ClientTCP.elf ClientTCP_0x00000.bin ClientTCP_0x11000.bin
xtensa-lx106-elf-gcc -o ClientTCP.elf -nostdlib -LF:/ZBasic/xtensa-lx106-elf/ESP8266_SDK/lib -Wl,-L,"F:/ZBasic/zlib/esp8266/ldscripts" -Wl,-T,ClientTCP.ld -Wl,--no-check-sections -u call_user_start -Wl,-static   -Wl,--defsym,zbSendByte=zbSendByteHW -Wl,--defsym,zbGetChar=zbGetCharHW "F:/ZBasic/zlib/esp8266/zbasic_xtensa.o"  ClientTCP.o   -Wl,--start-group "F:/ZBasic/zlib/esp8266/lib/libST_esp8266.a" "F:/ZBasic/zlib/esp8266/lib/libesp8266.a" -lc -lgcc -lhal -lpp -lphy -lnet80211 -llwip -lcrypto -lwpa -lmain -lm -Wl,--end-group
ClientTCP.o: In function `zf_netstat':
F:\ZBasic\progs\testtcp\zb_hHW8Nu/ClientTCP.c:409: undefined reference to `zbNetTimeout'
F:\ZBasic\progs\testtcp\zb_hHW8Nu/ClientTCP.c:426: undefined reference to `zbNetTimeout'
collect2.exe: error: ld returned 1 exit status
make: *** [ClientTCP.elf] Error 1
make: Leaving directory `F:/ZBasic/progs/testtcp/zb_hHW8Nu'
Error: one or more errors occurred in the back-end build process for "ClientTCP.esp"
>Exit code: 1
dkinzer

Post by dkinzer »

There is an updated ZX Library (for the ESP8266 only) at the URL below that should fix this issue. You should extract the files in the archive to the zlib subdirectory of the ZBasic installation directory, preserving the paths in the filenames.

http://www.zbasic.net/download/zlib/4.3 ... sp8266.zip
- Don Kinzer
JimG

Post by JimG »

It works. Thanks. I know we freebie users are not your top priority, so I appreciate the prompt response.

Except for a few times when the watchdog timer kicked in for unknown reasons, preceding Net.Connect with Net.SetTimeout now causes connect to return in a period of time rather than just going away forever. It's quite usable even if a little weird.
After many trial runs, the timeout value in Net.SetTimeout(handle, timeout) gives the following results.
(Note: these are estimates based on watching a clock, not measured with a timer.)

Code:

Value   Approx. timeout
 10     instant
 20     1/2 second
 30     1 second
 40     2 seconds
 50     3 seconds, with occasional 20 second timeout
 75     5 seconds
100     10 seconds
200     40 seconds
I am not getting a callback to my Net.SetCallback routine; Net.Connect just returns. The return values are 0 for success and -3 for no connection within the allotted time.
JimG

Post by JimG »

So I redid the timing tests, this time with code to do the timing.
Here is the quick and messy code section:

Code:

dim constart as unsignedlong
dim conend   as unsignedlong
dim condel as unsignedlong
dim contime  as unsignedlong
dim csecs as Single
dim tps as unsignedlong
dim timeout as integer = 75
dim timeoutflt as single
timeoutflt=csng(timeout)
tps=FRC2_ticksPerSec\100  ' ticks in one hundredth of a second
Call Net.SetTimeout(handle,timeout)
debug.print "in ConnectOnly, status=";netstat(handle)
constart = Register.FRC2_COUNT
stat = Net.Connect(handle, hostAddr, port, protocol)  ' port 13,   0=tcp, 1=udp
conend = Register.FRC2_COUNT
condel = conend-constart
contime = condel\tps    ' hundredths of a second
csecs=csng(contime)
csecs=csecs/100.0
Debug.Print "Net.Connect() return="; stat; ", ";netstat(handle);
debug.print "       testval=";timeout;" secs=";csecs;" secs/testval-unit=";cstr(csecs/timeoutflt)
And here are the results. The first column is the value passed to Net.SetTimeout, the second is the time in seconds that Net.Connect ran before the timeout interrupted it, and the third is the seconds per unit of the timeout value passed to Net.SetTimeout.

Code:

count   secs    seconds/count
 20      0.4    0.02
 30      0.9    0.03
 40      1.6    0.04
 50      2.5    0.05
 60      3.6    0.06
 75      5.62   0.075
100     10.0    0.1
150     22.5    0.15
200     40.0    0.2

Odd how the seconds/count is equal to the count/1000; it can't be a coincidence. Put another way, the measured timeout in seconds is the count squared divided by 1000. Not what you would call linear, but still usable if you know what you want the end result to be.
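
Editor's note: the measurements fit seconds = count * count / 1000. This is an empirical fit to the nine rows in the table, not documented Net.SetTimeout behavior, and the function names below are hypothetical. A minimal Python sketch that checks the fit against the measured rows and inverts it to pick a count for a desired wait:

```python
import math

# (value passed to Net.SetTimeout, seconds measured) copied from the table.
MEASURED = [(20, 0.4), (30, 0.9), (40, 1.6), (50, 2.5), (60, 3.6),
            (75, 5.62), (100, 10.0), (150, 22.5), (200, 40.0)]


def predicted_seconds(count):
    """Empirical fit only: seconds = count squared / 1000."""
    return count * count / 1000.0


def count_for_seconds(seconds):
    """Invert the fit to estimate a Net.SetTimeout value for a target wait."""
    return round(math.sqrt(seconds * 1000.0))


# Show measured versus predicted seconds for each tested count.
for count, secs in MEASURED:
    print(count, secs, predicted_seconds(count))
```

Under this fit a 10 second limit corresponds to a count of 100. Values outside the tested range of 20 to 200 are extrapolations and may not hold.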