Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force synchronization of the serial port? #46

Open
ijustlovemath opened this issue Mar 24, 2022 · 18 comments · May be fixed by #47
Open

Force synchronization of the serial port? #46

ijustlovemath opened this issue Mar 24, 2022 · 18 comments · May be fixed by #47

Comments

@ijustlovemath
Copy link

I'm running into a bizarre issue with this serial port library that I also saw in a Rust based library, maybe you can help!

I'm implemented the following test program, which just writes a command to a little Arduino I have connected over USB:

  3 import serial, streams
  4 
  5 proc main() =
  6     let port = newSerialStream("/dev/ttyUSB0"
  7         , 115200
  8         , Parity.None
  9         , 8
 10         , StopBits.One
 11         , Handshake.RequestToSend
 12         , buffered=false
 13     )
 14 
 15     defer: close(port)
 16 
 17     port.setTimeouts(1000,100)
 18 
 19 #    for i in 0..3:
 20 #        echo "tryna read"
 21 #        echo port.readline()
 22     port.write("volume up\n")
 23     port.flush()
 24 
 25 
 26 when isMainModule:
 27     main()

What's strange is that the Arduino receives the data immediately (evidenced by the RX LED flashing), but does not act on it until I force the program to halt. It does not halt unless I send it SIGINT (Control-C); so it appears to be hung up on closing the serial port?

This may be related to a strange RTS/CTS related issue I was seeing on my scope, where the sending party (my computer) would hold the RTS line, most likely triggering an interrupt and preventing the Arduino from processing the command until the port was forcibly closed/flushed by the OS.

Let me know if you have any suggestions, I'm all ears! So far all I've done is try all of the handshakes, and what it looks like is:

None: data is received by Arduino but it does not act on it; program terminates normally
RTS: data is received by Arduino and it does act on it but not until program is forcibly killed
XOnXOff: same as RTS
RTSXOnXOff: same as RTS

The timeouts are generous enough, as this whole transaction takes <10ms on my scope.

@euantorano
Copy link
Owner

euantorano commented Mar 24, 2022 via email

@euantorano
Copy link
Owner

euantorano commented Mar 24, 2022 via email

@ijustlovemath
Copy link
Author

Wow, thanks for the blistering fast reply! I'm doing a refactor to try newSerialPort, whose constructor only currently supports the device file name.

I'm on Arch, but have seen similar issues on other Unices. What's really interesting is that eventually the posix.close() call succeeds, but takes a long time. I think if we can figure out the problem we could probably improve both your and Rust's library!

If you're interested, I've got strace logs from the Rust and (functional) Arduino IDE interactions with the port, here: https://github.com/ijustlovemath/arduino-remote/tree/master/logs

@ijustlovemath
Copy link
Author

Seeing the same issue with the raw serial port, unfortunately!

@ijustlovemath
Copy link
Author

I've added the (nonfunctional) Nim and (functional) pySerial strace logs to the directory. Check out the README within for a brief summary of each.

@euantorano
Copy link
Owner

euantorano commented Mar 25, 2022 via email

@euantorano
Copy link
Owner

euantorano commented Mar 25, 2022

I’ve had a quick glance at the pyserial source and their flush method seems to be a bit different to ours:

They use tcdrain whereas we use tcflush. It looks like tcdrain is the correct function: https://linux.die.net/man/3/tcdrain

I’ll need to check the flags we set for termios too, but it might be a simple fix 🤞

@ijustlovemath
Copy link
Author

Happy to help more where I can! Really appreciate you being on it, maintainers like you are what make OSS so great!

@euantorano
Copy link
Owner

euantorano commented Mar 28, 2022

@ijustlovemath any chance you can try out the changes in #47?

If you're using Nimble, you should be able to require the branch like follows I believe (in your Nimble file):

requires "serial#fix/46-flush"

My testing from a Mac host to a Windows client has worked perfectly, but need to double check on other hardware as I was using FTDI serial to USB adapters on both ends.

@ijustlovemath
Copy link
Author

ijustlovemath commented Mar 29, 2022

Hi, sorry for the nooby question, but it's saying:

"Unexpected char in version range 'flush'"

Did i not put it in the nimble file correctly?

@ijustlovemath
Copy link
Author

Was able to work around using commit hash; maybe it's an issue with using a - or / in your branch name?

@ijustlovemath
Copy link
Author

Okay, so this is very strange. I went to run the tests again, and it still hangs with RTS/CTS, still exits with no handshaking, data still gets written, but when I tried to write to it (by using serialStream:writeLine), I get the error:

Error: expression 'writeLine(port, ["volume up\n"])' has no type (or is ambiguous)

Going to continue tweaking and see if i can't get a conclusive yes or no for you!

@ijustlovemath
Copy link
Author

ijustlovemath commented Mar 29, 2022

Sorry for the slow progress on this, brand new to Nim and still figuring things out! I found out that the reason for the "no type (or is ambiguous)" error is because I was discarding the return value from write(), but I guess it picked a write proc that has no return? I'll test on hardware ASAP and post results (though yesterday I was getting reads and writes with some weird pagination, which is progress!)

Here's the current attempt if you wanna try it on your setup: https://github.com/ijustlovemath/arduino-remote/blob/a6ee24efc85d87191be0df054fb945f8567a8650/nim/arduino_remote/src/arduino_remote.nim

@euantorano
Copy link
Owner

I was just about to ask if you had some code available! I'll try and test this after dinner tonight and see what happens. If I spot any improvements I'll send you a PR.

@ijustlovemath
Copy link
Author

Sorry this has taken so much of your time, I appreciate your continued help! Let me know if there's anywhere I can donate to send you a coffee or a beer.

I've done a little more investigation and it looks like the original problem of hanging when closing is caused on my system by serialport_posix.nim:848. The behavior is this:

Using RTS handshake. When trying to flush, SIGINT is ignored for many seconds while flush is attempted. After it finally fails, close() succeeds and the data is actually received and acted upon (volume goes up on my TV).

What's weird is that, when I comment out the final flush, it still hangs, but responds immediately to SIGINT, flushing the data and actually closing.

So I think it all comes back to this: why does flushing and calling posix.close() cause this to hang?

Once we know that, the fix should be a few lines!

@ijustlovemath
Copy link
Author

ijustlovemath commented Mar 31, 2022

Apologies for the issue spam, but I have some new findings! Let me know if you want me to lay off until I've made significant progress, but this is a step in the right direction!

I've reran my strace with --abbrev=none --verbose=all, which allows you to fully inspect the contents of termios structs. I found some interesting differences between pySerial and this library, which may be worth investigating:

pySerial

1540524 12:54:02 ioctl(3, SNDCTL_TMR_START or TCSETS, 
    {c_iflags=0
    , c_oflags=0
    , c_cflags=0x1cb2
    , c_lflags=0
    , c_line=0
    , c_cc[VMIN]=0
    , c_cc[VTIME]=0
    , c_cc="\x03\x1c\x7f\x15
            \x04\x00\x00\x00
            \x11\x13\x1a\x00
            \x12\x0f\x17\x16
            \x00\x00\x00"}) = 0

serial.nim

 55 1209461 18:54:24 ioctl(3, SNDCTL_TMR_START or TCSETS, 
    {c_iflags=0
    , c_oflags=0
    , c_cflags=0x1cb2
    , c_lflags=0xa00 // should be 0 not 0xa00
    , c_line=0
    , c_cc[VMIN]=1 // should be 0 not 1
    , c_cc[VTIME]=0
    , c_cc="\x03\x1c\x7f\x15
            \x04\x00\x01\x00  // [3] == 0x01, should be 0x00
            \x11\x13\x1a\x00
            \x12\x0f\x17\x16
            \x00\x00\x00"}) = 0

To summarize:

  • pySerial clears cc_lflags
  • pySerial uses a vmin (minimum bytes at the port) of 0
  • pySerial uses a minorly different set of control characters (byte at index 6 is 0x00) (this is actually VMIN)

I'll see what I can do about changing this and retesting. Would you be open to my adding an iflush(), oflush(), and drain() procs as wrappers for tcflush(..., TCIFLUSH), tcflush(..., TCOFLUSH), tcdrain() ? This would allow future users of this library more fine grained control over how these ports are used.

@euantorano
Copy link
Owner

That's very helpful, thanks. I'm tied up with work today and tomorrow but will take another look at this over the weekend. I suspected the termios struct might be slightly different, but I encountered some issues with undefined constants from Nim's termios wrapper which I need to find a solution for.

@euantorano
Copy link
Owner

Would you be open to my adding an iflush(), oflush(), and drain() procs as wrappers for tcflush(..., TCIFLUSH), tcflush(..., TCOFLUSH), tcdrain() ? This would allow future users of this library more fine grained control over how these ports are used.

Yes, I need to see if there is a Windows equivalent at all or if they'll just be null ops there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants