
Bug: Sync crashes/exits with core.time.TimeException: Invalid ISO Extended String #2813

Open
phlibi opened this issue Sep 17, 2024 · 36 comments · May be fixed by #2816

Comments


phlibi commented Sep 17, 2024

Describe the bug

I started actually using the onedrive client today, synchronizing a few SharePoint/O365 folders to my PC. This worked fine for one folder (another is still running), but the third crashes the client with a TimeException:

$ onedrive --monitor --confdir=/home/phip/.config/onedrive/swxxxin
D-Bus message bus daemon is available; GUI notifications are now enabled
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Configuring Global Azure AD Endpoints
OneDrive synchronisation interval (seconds): 300
Initialising filesystem inotify monitoring ...
Performing initial synchronisation to ensure consistent local state ...
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Starting a sync with Microsoft OneDrive
Fetching items from the OneDrive API for Drive ID: b!2iD..........nb-SaXtsp ..Attempting to perform a database vacuum to optimise database
Database vacuum is complete
core.time.TimeException@/usr/lib/ldc/x86_64-linux-gnu/include/d/std/datetime/systime.d(9101): Invalid ISO Extended String:  䁷�
----------------
??:? [0x55cd57f047b5]
??:? [0x55cd57f01441]
??:? [0x55cd57f01bb8]
??:? [0x55cd57f1c148]
??:? [0x55cd57f167bd]
??:? [0x55cd57f0ecd1]
??:? [0x55cd57f0d19b]
??:? [0x55cd57f9f05a]
??:? [0x55cd57f9c78e]
??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f17b1bc5b7b]
??:? _d_run_main2 [0x7f17b1bc5995]
??:? _d_run_main [0x7f17b1bc57ed]
??:? [0x7f17b1367249]
??:? __libc_start_main [0x7f17b1367304]
??:? [0x55cd57ee2220]

This happened after a lot of data had been synchronized (about 16 GB), probably everything in the folder. Now the process exits with this error message every time, even when started with --monitor.

Operating System Details

Debian Bookworm (12) with backports enabled
Linux phiptp 6.10.6+bpo-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.10.6-1~bpo12+1 (2024-08-26) x86_64 GNU/Linux

Client Installation Method

From 3rd Party Source (PPA, OpenSuSE Build Service etc)

OneDrive Account Type

SharePoint

What is your OneDrive Application Version

onedrive v2.5.0-1+np1+1.1

What is your OneDrive Application Configuration

For the offending run:

$ onedrive --display-config --confdir=~/.config/onedrive/swxxxin/
D-Bus message bus daemon is available; GUI notifications are now enabled
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Application version                          = onedrive v2.5.0-1+np1+1.1
Compiled with                                = LDC 2100
User Application Config path                 = /home/phip/.config/onedrive/swxxxin/
System Application Config path               = /etc/onedrive
Applicable Application 'config' location     = /home/phip/.config/onedrive/swxxxin/config
Configuration file found in config location  = true - using 'config' file values to override application defaults
Applicable 'sync_list' location              = /home/phip/.config/onedrive/swxxxin/sync_list
Applicable 'items.sqlite3' location          = /home/phip/.config/onedrive/swxxxin/items.sqlite3
Config option 'drive_id'                     = b!2iD8jxd7F.....nb-SaXtsp
Config option 'sync_dir'                     = ~/phipsfiles/swxxx/swxxxin
Config option 'enable_logging'               = false
Config option 'log_dir'                      = /var/log/onedrive
Config option 'disable_notifications'        = false
Config option 'skip_dir'                     = 
Config option 'skip_dir_strict_match'        = false
Config option 'skip_file'                    = ~*|.~*|*.tmp|*.swp|*.partial
Config option 'skip_dotfiles'                = false
Config option 'skip_symlinks'                = true
Config option 'monitor_interval'             = 300
Config option 'monitor_log_frequency'        = 12
Config option 'monitor_fullscan_frequency'   = 12
Config option 'read_only_auth_scope'         = false
Config option 'dry_run'                      = false
Config option 'upload_only'                  = false
Config option 'download_only'                = false
Config option 'local_first'                  = false
Config option 'check_nosync'                 = false
Config option 'check_nomount'                = false
Config option 'resync'                       = false
Config option 'resync_auth'                  = false
Config option 'cleanup_local_files'          = false
Config option 'classify_as_big_delete'       = 1000
Config option 'disable_upload_validation'    = false
Config option 'disable_download_validation'  = true
Config option 'bypass_data_preservation'     = false
Config option 'no_remote_delete'             = false
Config option 'remove_source_files'          = false
Config option 'sync_dir_permissions'         = 700
Config option 'sync_file_permissions'        = 600
Config option 'space_reservation'            = 52428800
Config option 'application_id'               = d50ca740-c83f-4d1b-b616-12c519384f0c
Config option 'azure_ad_endpoint'            = 
Config option 'azure_tenant_id'              = 
Config option 'user_agent'                   = ISV|abraunegg|OneDrive Client for Linux/v2.5.0-1+np1+1.1
Config option 'force_http_11'                = false
Config option 'debug_https'                  = false
Config option 'rate_limit'                   = 0
Config option 'operation_timeout'            = 3600
Config option 'dns_timeout'                  = 60
Config option 'connect_timeout'              = 10
Config option 'data_timeout'                 = 60
Config option 'ip_protocol_version'          = 0
Config option 'threads'                      = 8
Environment var 'XDG_RUNTIME_DIR'            = true
Environment var 'DBUS_SESSION_BUS_ADDRESS'   = true
Config option 'notify_file_actions'          = false

Selective sync 'sync_list' configured        = false

Config option 'sync_business_shared_items'   = false

Config option 'webhook_enabled'              = false

What is your 'curl' version

curl 8.9.1 (x86_64-pc-linux-gnu) libcurl/8.9.1 GnuTLS/3.7.9 zlib/1.2.13 brotli/1.0.9 zstd/1.5.4 libidn2/2.3.3 libpsl/0.21.2 libssh2/1.10.0 nghttp2/1.52.0 ngtcp2/1.6.0 nghttp3/1.4.0 librtmp/2.3 OpenLDAP/2.5.13
Release-Date: 2024-07-31, security patched: 8.9.1-2~bpo12+1
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTP3 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets zstd

Where is your 'sync_dir' located

Local

What are all your system 'mount points'

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,nosuid,relatime,size=15715820k,nr_inodes=3928955,mode=755,inode64)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3152504k,mode=755,inode64)
zroot/ROOT/debian on / type zfs (rw,relatime,xattr,noacl,casesensitive)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,inode64)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k,inode64)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=11726)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,nosuid,nodev,relatime,pagesize=2M)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
none on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
zroot on /zroot type zfs (rw,relatime,xattr,noacl,casesensitive)
zroot/data/home on /home type zfs (rw,relatime,xattr,noacl,casesensitive)
zroot/data/scratch on /scratch type zfs (rw,relatime,xattr,noacl,casesensitive)
zroot/data/home/phipsfiles on /home/phip/phipsfiles type zfs (rw,relatime,xattr,noacl,casesensitive)
zroot/data/home/phipsfiles/developing on /home/phip/phipsfiles/developing type zfs (rw,relatime,xattr,noacl,casesensitive)
zroot/data/home/phipsfiles/documents on /home/phip/phipsfiles/documents type zfs (rw,relatime,xattr,noacl,casesensitive)
zroot/data/home/phipsfiles/swxxx on /home/phip/phipsfiles/swxxx type zfs (rw,relatime,xattr,noacl,casesensitive)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)

What are all your local file system partition types

All local data is stored on ZFS

How do you use 'onedrive'

The folder is a company share, about 10 other people have access. It is not under heavy use, though. Most likely, there weren't any remote changes between my subsequent synchronization runs. All other people synchronize via the Windows client.

Steps to reproduce the behaviour

See above: Just running --sync or --monitor with the offending configuration reliably triggers the exception.

Complete Verbose Log Output

$ onedrive --monitor --confdir=/home/phip/.config/onedrive/swxxxin --verbose
D-Bus message bus daemon is available; GUI notifications are now enabled
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/swxxxin
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
Application Version:  onedrive v2.5.0-1+np1+1.1
Account Type:         documentLibrary
Default Drive ID:     b!2iD8j..........5nb-SaXtsp
Default Root ID:      01NRA.......54PWSELRRZ
Remaining Free Space: 25581.42 GB (27467845360701 bytes)
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/swxxxin
OneDrive synchronisation interval (seconds): 300
Maximum allowed open files:                  9223372036854775807
Maximum allowed inotify user watches:        239478
Initialising filesystem inotify monitoring ...
Monitoring directory: .
Monitoring directory: ./Markenschutz
Monitoring directory: ./Markenschutz/_history
Monitoring directory: ./Personelles

# ~3000 more such entries...

Monitoring directory: ./Logins
Monitoring directory: ./Logins/history
Performing initial synchronisation to ensure consistent local state ...
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Starting a sync with Microsoft OneDrive
Fetching /delta response from the OneDrive API for Drive ID: b!2iD8j............b5nb-SaXtsp
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 4
The local item is out-of-sync with OneDrive, renaming to preserve existing file and prevent local data loss: Artikel/CU/100-576A/_history/20210521_100-576A_servodraht.pdf -> Artikel/CU/100-576A/_history/20210521_100-576A_servodraht-phiptp-2.pdf
Monitored directory removed: Artikel/CU/100-287A-SN/
Monitored directory removed: CorporateIdentity/_history/Color Palettes/

# ~3000 more such entries...

Monitored directory removed: Artikel/CU/100-074A/
Monitored directory removed: Artikel/ST/200-221/
Attempting to perform a database vacuum to optimise database
Database vacuum is complete
core.time.TimeException@/usr/lib/ldc/x86_64-linux-gnu/include/d/std/datetime/systime.d(9101): Invalid ISO Extended String:  ä�·¢�
----------------
??:? [0x555c762457b5]
??:? [0x555c76242441]
??:? [0x555c76242bb8]
??:? [0x555c7625d148]
??:? [0x555c762577bd]
??:? [0x555c7624fcd1]
??:? [0x555c7624e19b]
??:? [0x555c762e005a]
??:? [0x555c762dd78e]
??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f622879db7b]
??:? _d_run_main2 [0x7f622879d995]
??:? _d_run_main [0x7f622879d7ed]
??:? [0x7f622841c249]
??:? __libc_start_main [0x7f622841c304]
??:? [0x555c76223220]

Screenshots

No response

Other Log Information or Details

No response

Additional context

No response

phlibi added the 'Bug' label Sep 17, 2024

abraunegg commented Sep 17, 2024

@phlibi
This is the same as #2810 - but that was closed by the user as non-reproducible.

This issue was never seen in >12 months of testing, so it could be unique to your environment. Please do the following:

  1. Stop any client action
  2. PRESERVE your items.sqlite3 file - make a backup of it as it exists, and email it to me please
  3. Please build the client from source and create a proper debug log to capture/decode where the issue is. The package you are using was not built with the --enable-debug flag, so you need to compile from source to get the right details.

When building from source, please enable debugging:

./configure --enable-debug
make clean; make;

To create a debug log output, please follow:

onedrive --sync --verbose --verbose > debug_output.log 2>&1

The above steps are required to decode the following:

??:? [0x555c762457b5]
??:? [0x555c76242441]
??:? [0x555c76242bb8]
??:? [0x555c7625d148]
??:? [0x555c762577bd]
??:? [0x555c7624fcd1]
??:? [0x555c7624e19b]
??:? [0x555c762e005a]
??:? [0x555c762dd78e]
??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f622879db7b]
??:? _d_run_main2 [0x7f622879d995]
??:? _d_run_main [0x7f622879d7ed]
??:? [0x7f622841c249]
??:? __libc_start_main [0x7f622841c304]
??:? [0x555c76223220]

Without rebuilding your client, this cannot be decoded correctly and, more importantly, it is impossible to work out what/where it is coming from.

Thanks.

abraunegg added the 'Duplicate', 'More Information Needed' and 'Waiting on Response' labels Sep 17, 2024
abraunegg added this to the v2.5.1 milestone Sep 17, 2024
abraunegg added a commit that referenced this issue Sep 18, 2024
* Add isValidUTCDateTime function to validate timestamp as received from OneDrive API to ensure it is valid
* Use new function before attempting to call SysTime.fromISOExtString to ensure this call will be successful
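
A rough sketch of what that guard looks like in practice. The function name isValidUTCDateTime comes from the commit message above; the surrounding names and error handling here are assumptions for illustration, not the actual PR code:

	import std.datetime.systime : SysTime;

	// Validate the raw database string first, so that a corrupt value
	// surfaces as a controlled application error instead of an uncaught
	// core.time.TimeException from fromISOExtString.
	SysTime parseDbTimestamp(string raw) {
	    if (!isValidUTCDateTime(raw)) {
	        throw new Exception("The DB record mtime entry is not a valid ISO timestamp entry");
	    }
	    return SysTime.fromISOExtString(raw);
	}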
abraunegg (Owner) commented:

@phlibi
I have been looking at this issue more deeply today, and I am still trying to deduce where the issue is being introduced. However, given that two issues of the same type have been raised, I have worked on a more generic fix.

Please can you compile the following PR:

git clone https://github.com/abraunegg/onedrive.git
cd onedrive
git fetch origin pull/2816/head:pr2816
git checkout pr2816

To build the PR, please follow the steps to prepare your system to build the client as detailed here: https://github.com/abraunegg/onedrive/blob/master/docs/install.md#building-from-source---high-level-requirements

When building the client, please use the following:

./configure --enable-debug
make clean; make;

When running the PR, your version should be: onedrive v2.5.0-3-gf06e816 or greater.


phlibi commented Sep 18, 2024

@abraunegg Wow, thanks for the quick response!

Here's what happens when I run the same with the PR:

$ ./onedrive --version
onedrive v2.5.0-5-gd8807f1

$ ./onedrive --sync --confdir=/home/phip/.config/onedrive/swixxxin
Reading configuration file: /home/phip/.config/onedrive/swixxxin/config
Configuration file successfully loaded
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Configuring Global Azure AD Endpoints
Fetching items from the OneDrive API for Drive ID: b!2iD8jx..............YTb5nb-SaXtsp ..Attempting to perform a database vacuum to optimise database
Database vacuum is complete
core.time.TimeException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/datetime/systime.d(9112): Invalid ISO Extended String:  䁷�
----------------
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/exception.d:460 pure @safe bool std.exception.enforce!(bool).enforce(bool, lazy object.Throwable) [0x558aaceb8684]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/datetime/systime.d:9112 @safe std.datetime.systime.SysTime std.datetime.systime.SysTime.fromISOExtString!(const(char)[]).fromISOExtString(scope const(char[]), immutable(std.datetime.timezone.TimeZone)) [0x558aacf6db46]
src/itemdb.d:725 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x558aad01fb86]
src/itemdb.d:522 bool itemdb.ItemDatabase.selectById(const(char)[], const(char)[], out itemdb.Item) [0x558aad01dfa6]
src/sync.d:1337 void syncEngine.SyncEngine.processRootAndDeletedJSONItems(std.json.JSONValue, immutable(char)[], bool, bool, bool) [0x558aacfe97b2]
src/sync.d:1273 void syncEngine.SyncEngine.processDeltaJSONItem(std.json.JSONValue, ulong, int, ulong, bool) [0x558aacfe8f07]
src/sync.d:1013 void syncEngine.SyncEngine.fetchOneDriveDeltaAPIResponse(immutable(char)[], immutable(char)[], immutable(char)[]) [0x558aacfe72fd]
src/sync.d:566 void syncEngine.SyncEngine.syncOneDriveAccountToLocalDisk() [0x558aacfe5265]
src/main.d:1221 void main.performStandardSyncProcess(immutable(char)[], monitor.Monitor) [0x558aace9f385]
src/main.d:777 _Dmain [0x558aace9c46d]

Using --verbose (once or twice) does not seem to add valuable information, at least not directly before the crash. Do you still need the items.sqlite3 and verbose log by e-mail?

abraunegg (Owner) commented:

@phlibi
A debug log via email would be greatly appreciated to help nail this down.

abraunegg (Owner) commented:

@phlibi
Another 'out of curiosity' item would be the items.sqlite3 file, to understand which DB entry it is failing on.


phlibi commented Sep 18, 2024

@abraunegg The archive with the files is ready, but where should I send it to? I can't find your e-mail address and the support@mynas.com.au mentioned on the issue creation page seems to be a bit unspecific.

abraunegg (Owner) commented:

> @abraunegg The archive with the files is ready, but where should I send it to? I can't find your e-mail address and the support@mynas.com.au mentioned on the issue creation page seems to be a bit unspecific.

That is the correct email address


phlibi commented Sep 18, 2024

@abraunegg Did you receive my e-mail?

abraunegg (Owner) commented:

No email or anything in spam

abraunegg (Owner) commented:

@phlibi
The emails have finally come through - some sort of O365 delay

abraunegg (Owner) commented:

@phlibi
I have responded via email


phlibi commented Sep 18, 2024

@abraunegg Thanks, I'll continue here so the track record stays in the issue.

First try without a resync asserts as expected:

$ ./onedrive --version
onedrive v2.5.0-6-g280d369

$ ./onedrive --sync --confdir=/home/phip/.config/onedrive/swxxxin --verbose
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/swxxxin
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
Application Version:  onedrive v2.5.0-6-g280d369
Account Type:         documentLibrary
Default Drive ID:     b!2iD8jxd7F.........JyTuYTb5nb-SaXtsp
Default Root ID:      01NRARY......4PWSELRRZ
Remaining Free Space: 25581.42 GB (27467836877735 bytes)
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/swxxxin
Fetching /delta response from the OneDrive API for Drive ID: b!2iD8jxd7Fk...........b5nb-SaXtsp
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 21
The local item is out-of-sync with OneDrive, renaming to preserve existing file and prevent local data loss: Artikel/CU/100-576A/_history/20210521_100-576A_servodraht.pdf -> Artikel/CU/100-576A/_history/20210521_100-576A_servodraht-phiptp-2.pdf
Attempting to perform a database vacuum to optimise database
Database vacuum is complete
core.exception.AssertError@src/itemdb.d(701): The DB record mtime entry is not a valid ISO timestamp entry
----------------
??:? _d_assert_msg [0x55f121b9fc30]
src/itemdb.d:701 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x55f121b6895b]
src/itemdb.d:526 bool itemdb.ItemDatabase.selectById(const(char)[], const(char)[], out itemdb.Item) [0x55f121b6706a]
src/sync.d:1337 void syncEngine.SyncEngine.processRootAndDeletedJSONItems(std.json.JSONValue, immutable(char)[], bool, bool, bool) [0x55f121b327b2]
src/sync.d:1273 void syncEngine.SyncEngine.processDeltaJSONItem(std.json.JSONValue, ulong, int, ulong, bool) [0x55f121b31f07]
src/sync.d:1013 void syncEngine.SyncEngine.fetchOneDriveDeltaAPIResponse(immutable(char)[], immutable(char)[], immutable(char)[]) [0x55f121b302fd]
src/sync.d:566 void syncEngine.SyncEngine.syncOneDriveAccountToLocalDisk() [0x55f121b2e265]
src/main.d:1221 void main.performStandardSyncProcess(immutable(char)[], monitor.Monitor) [0x55f1219e8385]
src/main.d:777 _Dmain [0x55f1219e546d]

Then with the resync:

$ ./onedrive --sync --confdir=/home/phip/.config/onedrive/swxxxin --verbose --resync --resync-auth
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/swxxxin
Deleting the saved application sync status ...
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
Application Version:  onedrive v2.5.0-6-g280d369
Account Type:         documentLibrary
Default Drive ID:     b!2iD8jxd7FkW1xAV..........YTb5nb-SaXtsp
Default Root ID:      01NRARY......54PWSELRRZ
Remaining Free Space: 25581.42 GB (27467836877735 bytes)
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/swxxxin
Fetching /delta response from the OneDrive API for Drive ID: b!2iD8jxd..........JyTuYTb5nb-SaXtsp
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 201
Processing API Response Bundle: 2 - Quantity of 'changes|items' in this bundle to process: 203
Processing API Response Bundle: 3 - Quantity of 'changes|items' in this bundle to process: 203
...
Processing API Response Bundle: 88 - Quantity of 'changes|items' in this bundle to process: 341
Processing API Response Bundle: 89 - Quantity of 'changes|items' in this bundle to process: 323
Processing API Response Bundle: 90 - Quantity of 'changes|items' in this bundle to process: 253
Finished processing /delta JSON response from the OneDrive API
Processing 23235 applicable changes and items received from Microsoft OneDrive
Processing OneDrive JSON item batch [1/47] to ensure consistent local state
Processing OneDrive JSON item batch [2/47] to ensure consistent local state
Processing OneDrive JSON item batch [3/47] to ensure consistent local state
Processing OneDrive JSON item batch [4/47] to ensure consistent local state
Processing OneDrive JSON item batch [5/47] to ensure consistent local state
Processing OneDrive JSON item batch [6/47] to ensure consistent local state
Processing OneDrive JSON item batch [7/47] to ensure consistent local state
Processing OneDrive JSON item batch [8/47] to ensure consistent local state
Local file time discrepancy detected: ./20160906_bfs_uid.pdf
This local file has a different modified time 2024-Sep-17 11:14:48Z (UTC) when compared to remote modified time 2016-Sep-26 13:14:18Z (UTC)
Local item has the same hash value as the item online - correcting the applicable file timestamp
The source of the incorrect timestamp was OneDrive online - correcting timestamp online
Local file modified time matches existing database record - keeping local file
Local file time discrepancy detected: Artikel/_copy_pdf.py
This local file has a different modified time 2024-Sep-17 11:14:49Z (UTC) when compared to remote modified time 2023-Apr-21 12:41:03Z (UTC)
Local item has the same hash value as the item online - correcting the applicable file timestamp
The source of the incorrect timestamp was OneDrive online - correcting timestamp online
Local file modified time matches existing database record - keeping local file
Local file time discrepancy detected: Artikel/seriennummern.ods
...
This local file has a different modified time 2024-Sep-17 11:31:53Z (UTC) when compared to remote modified time 2017-Aug-23 08:27:50Z (UTC)
Local item has the same hash value as the item online - correcting the applicable file timestamp
The source of the incorrect timestamp was OneDrive online - correcting timestamp online
^C
Received termination signal, attempting to cleanly shutdown application
Waiting for any existing upload|download process to complete
Attempting to perform a database vacuum to optimise database
Database vacuum is complete

It was running like this for some time, apparently spitting out the same "correcting timestamp online" message for each file present. I'm a bit worried about it messing with the data on the server, which is why I stopped the process at that point. I also don't know where those mismatches could come from, as I didn't modify the local files and nobody would have changed all of them online. Just to be sure, I checked my local time: it is correct and synchronized via NTP, so that should not be an issue. To me, it looks like the online files have their timestamp set to whenever they were actually created (which is fine), while the local file timestamps are from yesterday when I first synchronized the data. So if it now overwrites all those original timestamps with yesterday's date, that is a bit, well, uncool.

To avoid messing with the online data, I then added --download-only, and this time it ran through:

$ ./onedrive --sync --confdir=/home/phip/.config/onedrive/swxxxin --verbose --resync --resync-auth --download-only
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/swxxxin
Deleting the saved application sync status ...
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
Application Version:  onedrive v2.5.0-6-g280d369
Account Type:         documentLibrary
Default Drive ID:     b!2iD8jxd...............Tb5nb-SaXtsp
Default Root ID:      01NRARYLF6Y2GOVW7725BZO354PWSELRRZ
Remaining Free Space: 25578.44 GB (27464645763683 bytes)
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/swxxxin
Fetching /delta response from the OneDrive API for Drive ID: b!2iD8jxd...............Tb5nb-SaXtsp
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 201
Processing API Response Bundle: 2 - Quantity of 'changes|items' in this bundle to process: 203
...
Processing API Response Bundle: 88 - Quantity of 'changes|items' in this bundle to process: 341
Processing API Response Bundle: 89 - Quantity of 'changes|items' in this bundle to process: 323
Processing API Response Bundle: 90 - Quantity of 'changes|items' in this bundle to process: 253
Finished processing /delta JSON response from the OneDrive API
Processing 23235 applicable changes and items received from Microsoft OneDrive
Processing OneDrive JSON item batch [1/47] to ensure consistent local state
Processing OneDrive JSON item batch [2/47] to ensure consistent local state
Processing OneDrive JSON item batch [3/47] to ensure consistent local state
Processing OneDrive JSON item batch [4/47] to ensure consistent local state
Processing OneDrive JSON item batch [5/47] to ensure consistent local state
Processing OneDrive JSON item batch [6/47] to ensure consistent local state
Processing OneDrive JSON item batch [7/47] to ensure consistent local state
Processing OneDrive JSON item batch [8/47] to ensure consistent local state
Processing OneDrive JSON item batch [9/47] to ensure consistent local state
Processing OneDrive JSON item batch [10/47] to ensure consistent local state
Processing OneDrive JSON item batch [11/47] to ensure consistent local state
Processing OneDrive JSON item batch [12/47] to ensure consistent local state
Processing OneDrive JSON item batch [13/47] to ensure consistent local state
Processing OneDrive JSON item batch [14/47] to ensure consistent local state
Processing OneDrive JSON item batch [15/47] to ensure consistent local state
Processing OneDrive JSON item batch [16/47] to ensure consistent local state
Processing OneDrive JSON item batch [17/47] to ensure consistent local state
Processing OneDrive JSON item batch [18/47] to ensure consistent local state
Processing OneDrive JSON item batch [19/47] to ensure consistent local state
Local file time discrepancy detected: Literatur/D_blatt/AirFlowSensor/honeywell-sensing-zephyr-digital-airflow-sensors-haf-series-high-accuracy-10slpm-to-300slpm-product-sheet-008268-4-en(1).pdf
This local file has a different modified time 2024-Sep-17 11:31:53Z (UTC) when compared to remote modified time 2017-Aug-23 08:27:50Z (UTC)
Local item has the same hash value as the item online - correcting the applicable file timestamp
The source of the incorrect timestamp was the local file - correcting timestamp locally due to --download-only
Local file time discrepancy detected: Literatur/D_blatt/AirFlowSensor/honeywell-sensing-pressure-force-flow-rangeguide-008081-18-EN.pdf
This local file has a different modified time 2024-Sep-17 11:31:53Z (UTC) when compared to remote modified time 2017-Aug-23 08:27:50Z (UTC)
Local item has the same hash value as the item online - correcting the applicable file timestamp
The source of the incorrect timestamp was the local file - correcting timestamp locally due to --download-only
...
Performing a database consistency and integrity check on locally stored data
Processing DB entries for this Drive ID: b!2iD8jxd...............Tb5nb-SaXtsp
Processing: ~/phipsfiles/swxxx/swxxxin
The directory has not changed
Processing: General
The directory has not changed
...
Sync with Microsoft OneDrive is complete
Attempting to perform a database vacuum to optimise database
Database vacuum is complete

Strange thing: It says that it modified the local modification time of the files, but when I check on my disk, almost all of them still have an mtime from the first synchronization yesterday.
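
(For spot-checking this, GNU coreutils' stat prints a file's mtime directly; the path here is just an illustrative one from the log above:)

$ stat -c '%y %n' Artikel/_copy_pdf.py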

Now another run without special flags:

$ ./onedrive --sync --confdir=/home/phip/.config/onedrive/swxxxin --verbose
Reading configuration file: /home/phip/.config/onedrive/swxxxin/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/swxxxin
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
Application Version:  onedrive v2.5.0-6-g280d369
Account Type:         documentLibrary
Default Drive ID:     b!2iD8jxd...............Tb5nb-SaXtsp
Default Root ID:      01NRARYLF6Y2GOVW7725BZO354PWSELRRZ
Remaining Free Space: 25578.44 GB (27464645691750 bytes)
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/swxxxin
Fetching /delta response from the OneDrive API for Drive ID: b!2iD8jxd...............Tb5nb-SaXtsp
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 0
Finished processing /delta JSON response from the OneDrive API
No changes or items that can be applied were discovered while processing the data received from Microsoft OneDrive
Performing a database consistency and integrity check on locally stored data
Processing DB entries for this Drive ID: b!2iD8jxd...............Tb5nb-SaXtsp
Processing: ~/phipsfiles/swxxx/swxxxin
The directory has not changed
Processing: General
The directory has not changed
Processing: AGB
The directory has not changed
...
The file has not changed
Scanning the local file system '~/phipsfiles/swxxx/swxxxin' for new data to upload
Performing a last examination of the most recent online data within Microsoft OneDrive to complete the reconciliation process
Fetching /delta response from the OneDrive API for Drive ID: b!2iD8jxd...............Tb5nb-SaXtsp
Processing API Response Bundle: 1 - Quantity of 'changes|items' in this bundle to process: 0
Finished processing /delta JSON response from the OneDrive API
No changes or items that can be applied were discovered while processing the data received from Microsoft OneDrive

Sync with Microsoft OneDrive is complete
Attempting to perform a database vacuum to optimise database
Database vacuum is complete

So it looks as if it is now working properly. Whether this is simply due to the resync, the patch or something else, I don't know.


phlibi commented Sep 18, 2024

Funny. With one of the folders now fixed, I went ahead and synchronized another one, this time starting directly with --resync --resync-auth --verbose --download-only. Again, it appears to go through until the end and then fails, this time with a different but probably related exception:

Processing: heliumv/bestellungen/2020
The directory has not changed
Attempting to perform a database vacuum to optimise database
Database vacuum is complete
std.utf.UTFException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d(1556): Invalid UTF-8 sequence (at index 1)
----------------
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1594 pure dchar std.utf.decodeImpl!(true, 0, const(char)[]).decodeImpl(ref const(char)[], ref ulong) [0x5627578b3090]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1186 pure @trusted dchar std.utf.decode!(0, const(char)[]).decode(scope ref const(char)[], ref ulong) [0x5627578b3003]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:827 pure @safe bool std.regex.internal.ir.Input!(char).Input.nextChar(ref dchar, ref ulong) [0x562757888cb2]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:789 pure @trusted bool std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.next() [0x56275788ccc4]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:943 pure @trusted int std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.match(std.regex.internal.ir.Group!(ulong).Group[]) [0x5627578910d1]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 pure void std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).__lambda4!(std.regex.internal.ir.Group!(ulong).Group[]).__lambda4(std.regex.internal.ir.Group!(ulong).Group[]) [0x5627578a7050]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:1122 pure void std.regex.internal.ir.SmallFixedArray!(std.regex.internal.ir.Group!(ulong).Group, 3u).SmallFixedArray.mutate(scope void delegate(std.regex.internal.ir.Group!(ulong).Group[]) pure) [0x56275789819a]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 ref @trusted std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x5627578a6fb6]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:1013 @safe std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.match!(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).match(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x5627578a6e1d]
src/util.d:521 bool util.isValidUTCDateTime(immutable(char)[]) [0x5627578ad9ee]
src/itemdb.d:701 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x56275790b920]
src/itemdb.d:505 itemdb.Item[] itemdb.ItemDatabase.selectChildren(const(char)[], const(char)[]) [0x562757909db7]
src/sync.d:3371 void syncEngine.SyncEngine.checkDirectoryDatabaseItemForConsistency(itemdb.Item, immutable(char)[]) [0x5627578e1f4b]
src/sync.d:3217 void syncEngine.SyncEngine.checkDatabaseItemForConsistency(itemdb.Item) [0x5627578e0eb1]
src/sync.d:3373 void syncEngine.SyncEngine.checkDirectoryDatabaseItemForConsistency(itemdb.Item, immutable(char)[]) [0x5627578e1fce]
src/sync.d:3217 void syncEngine.SyncEngine.checkDatabaseItemForConsistency(itemdb.Item) [0x5627578e0eb1]
src/sync.d:3373 void syncEngine.SyncEngine.checkDirectoryDatabaseItemForConsistency(itemdb.Item, immutable(char)[]) [0x5627578e1fce]
src/sync.d:3217 void syncEngine.SyncEngine.checkDatabaseItemForConsistency(itemdb.Item) [0x5627578e0eb1]
src/sync.d:3373 void syncEngine.SyncEngine.checkDirectoryDatabaseItemForConsistency(itemdb.Item, immutable(char)[]) [0x5627578e1fce]
src/sync.d:3217 void syncEngine.SyncEngine.checkDatabaseItemForConsistency(itemdb.Item) [0x5627578e0eb1]
src/sync.d:3132 void syncEngine.SyncEngine.performDatabaseConsistencyAndIntegrityCheck() [0x5627578e09c2]
src/main.d:763 _Dmain [0x562757788393]

I then restarted the exact same process and it completed without the exception within a few seconds. So it appears that a run with --resync --resync-auth and maybe --download-only is a valid fix/workaround for this type of failure, at least when it happens during the initial sync. It is also interesting to note that this time the file timestamps appear to have been properly updated from the server; I now have files with an mtime dating back several years in this folder.

abraunegg (Owner) commented:

> ...
> This local file has a different modified time 2024-Sep-17 11:14:48Z (UTC) when compared to remote modified time 2016-Sep-26 13:14:18Z (UTC)
> Local item has the same hash value as the item online - correcting the applicable file timestamp
> The source of the incorrect timestamp was OneDrive online - correcting timestamp online
> Local file modified time matches existing database record - keeping local file

> I'm a bit worried about it messing with the data on the server, which is why I stopped the process at that point. I also don't know where those mismatches could come from, as I didn't modify the local files and nobody would have changed all of them online. Just to be sure, I checked my local time: it is correct and synchronized via NTP, so that should not be an issue. To me, it looks like the online files have their timestamp set to whenever they were actually created (which is fine), while the local file timestamps are from yesterday when I first synchronized the data. So if it now overwrites all those original timestamps with yesterday's date, that is a bit, well, uncool.

Yes .. this is NOT right .... but also shows an issue .. in a --resync scenario, the remote timestamp should be used (NEW BUG)

> Local file time discrepancy detected: Literatur/D_blatt/AirFlowSensor/honeywell-sensing-zephyr-digital-airflow-sensors-haf-series-high-accuracy-10slpm-to-300slpm-product-sheet-008268-4-en(1).pdf
> This local file has a different modified time 2024-Sep-17 11:31:53Z (UTC) when compared to remote modified time 2017-Aug-23 08:27:50Z (UTC)
> Local item has the same hash value as the item online - correcting the applicable file timestamp
> The source of the incorrect timestamp was the local file - correcting timestamp locally due to --download-only

This is what should also be happening in a --resync scenario

> Strange thing: It says that it modified the local modification time of the files, but when I check on my disk, almost all of them still have an mtime from the first synchronization yesterday.

So yesterday's sync would have been the 'first' time, which is also why the local timestamps are potentially not what they should be.

> So it looks as if it is now working properly. Whether this is simply due to the resync, the patch or something else, I don't know.

A combination of the patch and --resync .. but the added logging in the application for when timestamps are 'missing' has not been triggered, and neither have the timestamp validation checks .. so the only thing that I can think of here is some sort of internal database structure | memory corruption on use ....

> It is also interesting to note that this time the file timestamps appear to have been properly updated from the server; I now have files with an mtime dating back several years in this folder.

That is good news!

What I will do for this PR:

  1. Fix up the timestamp logic in this PR when using --resync | file not in database = use online timestamp .. as that is a bug
  2. Back out some of the database optimisation changes to align with how v2.4.25 was configured

Regarding this item:

> std.utf.UTFException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d(1556): Invalid UTF-8 sequence (at index 1)

Please raise a new bug so it is tracked separately.

Unfortunately I am not going to be able to look at anything today - so looking at these issues any further will be another ~24 hrs or so - sorry.

abraunegg removed the 'Duplicate' label Sep 18, 2024
abraunegg linked a pull request Sep 20, 2024 that will close this issue
abraunegg (Owner) commented:

@phlibi
I have updated the #2816 PR with a number of changes this morning.

Please can you rebuild your client using this PR, to validate the fix for this issue.


phlibi commented Sep 20, 2024

@abraunegg Thanks!

I'm currently travelling and only have a slow internet connection, so I haven't yet been able to repeat the original process. But a fresh sync of a single directory containing mostly small text files still leads to an mtime error:

$ ./onedrive --confdir ~/.config/onedrive/pruefprotokolle --sync --verbose --single-directory Pruefprotokolle
Reading configuration file: /home/phip/.config/onedrive/pruefprotokolle/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/pruefprotokolle
Application configuration file has been updated, checking if --resync needed
WARNING: no backup config file was found, unable to validate if any changes made
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
WARNING: OneDrive quota information is being restricted. Please fix by speaking to your OneDrive / Office 365 Administrator.
Application Version:  onedrive v2.5.0-11-g8ea3a8c
Account Type:         documentLibrary
Default Drive ID:     b!ozVsZqWF.........b5nb-SaXtsp
Default Root ID:      01EEYLR.......PWSELRRZ
Remaining Free Space: Not Available
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/pruefprotokolle
The OneDrive Client was asked to search for this directory online and create it if it's not located: ./Pruefprotokolle
WARNING: The requested path for --single-directory does not exist locally. Creating requested path within /home/phip/phipsfiles/swxxx/pruefprotokolle
Syncing changes from this selected path: Pruefprotokolle
Generating a /delta response from the OneDrive API for Drive ID: b!ozVsZqWFpUSUs33CI2Qz-cKLyN2PNqJGmwq1JbOq-qcBGxmJyTuYTb5nb-SaXtsp
Adding OneDrive root details for processing
Adding OneDrive folder details for processing
Adding 200 OneDrive items for processing from the OneDrive 'root' Folder
Adding 2 OneDrive items for processing from Pruefprotokolle/Neuma
...
Downloading file: Pruefprotokolle/zzzLastSN.txt ... done
Performing a database consistency and integrity check on locally stored data
Processing DB entries for this Drive ID: b!ozVsZ...........uYTb5nb-SaXtsp
WARNING: OneDrive quota information is being restricted. Please fix by speaking to your OneDrive / Office 365 Administrator.
Attempting to perform a database vacuum to optimise database
Database vacuum is complete
core.exception.AssertError@src/itemdb.d(701): The DB record mtime entry is not a valid ISO timestamp entry
----------------
??:? _d_assert_msg [0x5624f9396f80]
src/itemdb.d:701 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x5624f935fcab]
src/itemdb.d:505 itemdb.Item[] itemdb.ItemDatabase.selectChildren(const(char)[], const(char)[]) [0x5624f935e107]
src/sync.d:6346 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x5624f9346380]
src/sync.d:6350 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x5624f934643f]
src/sync.d:6350 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x5624f934643f]
src/sync.d:6350 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x5624f934643f]
src/sync.d:3115 void syncEngine.SyncEngine.performDatabaseConsistencyAndIntegrityCheck() [0x5624f9334842]
src/main.d:1232 void main.performStandardSyncProcess(immutable(char)[], monitor.Monitor) [0x5624f91df422]
src/main.d:781 _Dmain [0x5624f91dc4b1]

$ ./onedrive --version
onedrive v2.5.0-11-g8ea3a8c

Again, the files seem to have been downloaded just fine and the stored mtimes also seem to be correct, which is certainly a good thing.

Running the same process again with --resync --resync-auth completes successfully, as before:

$ ./onedrive --confdir ~/.config/onedrive/pruefprotokolle --sync --verbose --single-directory Pruefprotokolle --resync --resync-auth
Reading configuration file: /home/phip/.config/onedrive/pruefprotokolle/config
Configuration file successfully loaded
Using 'user' configuration path for application config and state data: /home/phip/.config/onedrive/pruefprotokolle
Deleting the saved application sync status ...
Using IPv4 and IPv6 (if configured) for all network operations
Attempting to contact Microsoft OneDrive Login Service
Successfully reached Microsoft OneDrive Login Service
Checking Application Version ...
Attempting to initialise the OneDrive API ...
Configuring Global Azure AD Endpoints
The OneDrive API was initialised successfully
Opening the item database ...
WARNING: OneDrive quota information is being restricted. Please fix by speaking to your OneDrive / Office 365 Administrator.
Application Version:  onedrive v2.5.0-11-g8ea3a8c
Account Type:         documentLibrary
Default Drive ID:     b!ozVsZqW..........yTuYTb5nb-SaXtsp
Default Root ID:      01EEYLRAV.........354PWSELRRZ
Remaining Free Space: Not Available
Sync Engine Initialised with new Onedrive API instance
All application operations will be performed in the configured local 'sync_dir' directory: /home/phip/phipsfiles/swxxx/pruefprotokolle
The OneDrive Client was asked to search for this directory online and create it if it's not located: ./Pruefprotokolle
Syncing changes from this selected path: Pruefprotokolle
Generating a /delta response from the OneDrive API for Drive ID: b!ozVsZq...............YTb5nb-SaXtsp
Adding OneDrive root details for processing
Adding OneDrive folder details for processing
Adding 200 OneDrive items for processing from the OneDrive 'root' Folder
Adding 2 OneDrive items for processing from Pruefprotokolle/Neuma
...
Adding 200 OneDrive items for processing from the OneDrive 'root' Folder
Adding 61 OneDrive items for processing from the OneDrive 'root' Folder
Finished processing self generated /delta JSON response from the OneDrive API
Processing 2964 applicable changes and items received from Microsoft OneDrive
Processing OneDrive JSON item batch [1/6] to ensure consistent local state
Processing OneDrive JSON item batch [2/6] to ensure consistent local state
Processing OneDrive JSON item batch [3/6] to ensure consistent local state
Processing OneDrive JSON item batch [4/6] to ensure consistent local state
Processing OneDrive JSON item batch [5/6] to ensure consistent local state
Processing OneDrive JSON item batch [6/6] to ensure consistent local state
Performing a database consistency and integrity check on locally stored data
Processing DB entries for this Drive ID: b!ozVsZqWFpUSUs33.................YTb5nb-SaXtsp
WARNING: OneDrive quota information is being restricted. Please fix by speaking to your OneDrive / Office 365 Administrator.
Processing: Pruefprotokolle/Neuma
The directory has not changed
...
Scanning the local file system '~/phipsfiles/swxxx/pruefprotokolle/Pruefprotokolle' for new data to upload
Performing a last examination of the most recent online data within Microsoft OneDrive to complete the reconciliation process
Generating a /delta response from the OneDrive API for Drive ID: b!ozVsZqWFpUS.............TuYTb5nb-SaXtsp
Adding OneDrive root details for processing
Adding OneDrive folder details for processing
Adding 200 OneDrive items for processing from the OneDrive 'root' Folder
Adding 2 OneDrive items for processing from Pruefprotokolle/Neuma
...
Processing OneDrive JSON item batch [5/6] to ensure consistent local state
Processing OneDrive JSON item batch [6/6] to ensure consistent local state

Sync with Microsoft OneDrive is complete
Attempting to perform a database vacuum to optimise database
Database vacuum is complete

Out of curiosity: What is this new assertion, or what is asserted against? Is this a data string not matching an expected format (YYYY-MM-DDTHH:MM:SSZ or something), or is it a timestamp value that is out of bounds, or what is it exactly that triggers the exception? Could this have anything to do with my local timezone or locale settings?

$ date
Fr 20 Sep 2024 07:56:23 CEST
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_CH.UTF-8
LC_TIME=de_CH.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=de_CH.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT=de_CH.UTF-8
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Please note that I'm using a "mixed" locale, with US English for the language, but Swiss German for some other formats, including date/time and numbers.

abraunegg (Owner) commented:

@phlibi

> ./onedrive --confdir ~/.config/onedrive/pruefprotokolle --sync --verbose --single-directory ....

You need to add --resync as well


phlibi commented Sep 20, 2024

@abraunegg
I know, that's what I did in the second run, which was successful (as before). The first run failed on an initial sync, where I assume that --resync does not make a difference, since there is nothing available before anyway.

abraunegg (Owner) commented:

@phlibi

> Out of curiosity: What is this new assertion, or what is asserted against? Is this a data string not matching an expected format (YYYY-MM-DDTHH:MM:SSZ or something), or is it a timestamp value that is out of bounds, or what is it exactly that triggers the exception? Could this have anything to do with my local timezone or locale settings?

That assertion is triggered when the 'mtime' data in the database (which is stored as a string) fails any of these checks:

  • the string is not a valid UTF-8 string
  • the string is not in the valid UTC time format of YYYY-MM-DDTHH:MM:SSZ
  • the string cannot be converted to a valid SysTime struct using the fromISOExtString function

Essentially, the data that was stored in the DB record is 'bad' for some reason ..
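
A minimal sketch of those three checks in D (this mirrors the list above; the actual isValidUTCDateTime in the PR may differ in detail, e.g. in the exact regex used):

	import std.datetime.systime : SysTime;
	import std.regex : ctRegex, matchFirst;
	import std.utf : UTFException, validate;

	bool isValidUTCDateTime(string timestamp) {
	    // 1. The string must be valid UTF-8.
	    try {
	        validate(timestamp);
	    } catch (UTFException e) {
	        return false;
	    }
	    // 2. The string must match YYYY-MM-DDTHH:MM:SSZ.
	    if (matchFirst(timestamp, ctRegex!`^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$`).empty) {
	        return false;
	    }
	    // 3. The string must convert cleanly to a SysTime.
	    try {
	        SysTime.fromISOExtString(timestamp);
	    } catch (Exception e) {
	        return false;
	    }
	    return true;
	}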


abraunegg commented Sep 20, 2024

@phlibi

> Is this a data string not matching an expected format

To narrow this down, I have updated the PR to add the following:

  • BAD TIMESTAMP (UTF-8 FAIL):
  • BAD TIMESTAMP (REGEX FAIL):
  • BAD TIMESTAMP (CONVERSION FAIL):

If you rebuild the PR and use onedrive v2.5.0-13-ga435e67 or greater, you will get the reason for the assertion


phlibi commented Sep 20, 2024

@abraunegg It's really strange. Since the --resync again "fixed" the issue, I had to remove all the data and download it from scratch, and this time it failed again, but now with the UTFException (#2829):

std.utf.UTFException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d(1556): Invalid UTF-8 sequence (at index 1)
----------------
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1646 pure dchar std.utf.decodeImpl!(true, 0, const(char)[]).decodeImpl(ref const(char)[], ref ulong) [0x557e6203933c]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1186 pure @trusted dchar std.utf.decode!(0, const(char)[]).decode(scope ref const(char)[], ref ulong) [0x557e62039257]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:827 pure @safe bool std.regex.internal.ir.Input!(char).Input.nextChar(ref dchar, ref ulong) [0x557e6200ed0e]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:789 pure @trusted bool std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.next() [0x557e62012d20]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:943 pure @trusted int std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.match(std.regex.internal.ir.Group!(ulong).Group[]) [0x557e6201712d]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 pure void std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).__lambda4!(std.regex.internal.ir.Group!(ulong).Group[]).__lambda4(std.regex.internal.ir.Group!(ulong).Group[]) [0x557e6202d0ac]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:1122 pure void std.regex.internal.ir.SmallFixedArray!(std.regex.internal.ir.Group!(ulong).Group, 3u).SmallFixedArray.mutate(scope void delegate(std.regex.internal.ir.Group!(ulong).Group[]) pure) [0x557e6201e1f6]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 ref @trusted std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x557e6202d012]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:1013 @safe std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.match!(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).match(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x557e6202ce79]
src/util.d:540 bool util.isValidUTCDateTime(immutable(char)[]) [0x557e62033b5b]
src/itemdb.d:701 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x557e62091dcc]
src/itemdb.d:505 itemdb.Item[] itemdb.ItemDatabase.selectChildren(const(char)[], const(char)[]) [0x557e62090263]
src/sync.d:6346 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x557e620784dc]
src/sync.d:6679 std.json.JSONValue syncEngine.SyncEngine.generateDeltaResponse(immutable(char)[]) [0x557e6207a987]
src/sync.d:1098 void syncEngine.SyncEngine.fetchOneDriveDeltaAPIResponse(immutable(char)[], immutable(char)[], immutable(char)[]) [0x557e62059b31]
src/sync.d:566 void syncEngine.SyncEngine.syncOneDriveAccountToLocalDisk() [0x557e620574b9]
src/main.d:1225 void main.performStandardSyncProcess(immutable(char)[], monitor.Monitor) [0x557e61f113c9]
src/main.d:781 _Dmain [0x557e61f0e4b1]

I then hacked a bit of debug output into the code to show the actual string it's getting from the DB in this case and got this:

Exception during isValidUTCDateTime assertion: '�d�]'

Here's my change at itemdb.d:701 (I have no experience with D; I just did some trial and error with the help of an AI tool...):

		try {
			assert(isValidUTCDateTime(result.front[7].dup), "The DB record mtime entry is not a valid ISO timestamp entry: '" ~ result.front[7].dup ~ "'");
		} catch (Exception e) {
			writeln("Exception during isValidUTCDateTime assertion: '" ~ result.front[7].dup ~ "'");
			throw e; // Re-throw the exception after logging
		}

My assumption is that this just reads some random data, which will then either completely fail UTF-8 decoding, as in this example, leading to the UTFException, or be some random bytes that happen to be valid UTF-8 but then obviously don't pass the check for a valid timestamp. I guess that #2813 and #2829 really have the same cause, and just manifest as different exceptions depending on what random data is returned.

I'm now running an initial sync again, to see with what it comes up this time.

When you checked the SQLite file that I sent you, did you find any such bogus mtime entries? I guess they would have to show up there somehow; if not, that would mean the data gets corrupted during reading. But that wouldn't explain why it works after a --resync, since the data is certainly read multiple times by the same code, but maybe written in different branches, depending on whether --resync is used or not?

@phlibi
Author

phlibi commented Sep 20, 2024

Run is done already, again with a very bad string:

...
Exception during isValidUTCDateTime assertion: '����~'
Attempting to perform a database vacuum to optimise database
Database vacuum is complete
std.utf.UTFException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d(1556): Invalid UTF-8 sequence (at index 1)
----------------
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1594 pure dchar std.utf.decodeImpl!(true, 0, const(char)[]).decodeImpl(ref const(char)[], ref ulong) [0x55df47e483d8]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1186 pure @trusted dchar std.utf.decode!(0, const(char)[]).decode(scope ref const(char)[], ref ulong) [0x55df47e4834b]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:827 pure @safe bool std.regex.internal.ir.Input!(char).Input.nextChar(ref dchar, ref ulong) [0x55df47e1de02]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:789 pure @trusted bool std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.next() [0x55df47e21e14]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:943 pure @trusted int std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.match(std.regex.internal.ir.Group!(ulong).Group[]) [0x55df47e26221]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 pure void std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).__lambda4!(std.regex.internal.ir.Group!(ulong).Group[]).__lambda4(std.regex.internal.ir.Group!(ulong).Group[]) [0x55df47e3c1a0]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:1122 pure void std.regex.internal.ir.SmallFixedArray!(std.regex.internal.ir.Group!(ulong).Group, 3u).SmallFixedArray.mutate(scope void delegate(std.regex.internal.ir.Group!(ulong).Group[]) pure) [0x55df47e2d2ea]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 ref @trusted std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x55df47e3c106]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:1013 @safe std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.match!(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).match(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x55df47e3bf6d]
src/util.d:540 bool util.isValidUTCDateTime(immutable(char)[]) [0x55df47e42c4f]
src/itemdb.d:702 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x55df47ea0ec0]
src/itemdb.d:505 itemdb.Item[] itemdb.ItemDatabase.selectChildren(const(char)[], const(char)[]) [0x55df47e9f357]
src/sync.d:6346 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x55df47e875d0]
src/sync.d:3115 void syncEngine.SyncEngine.performDatabaseConsistencyAndIntegrityCheck() [0x55df47e75a92]
src/main.d:1232 void main.performStandardSyncProcess(immutable(char)[], monitor.Monitor) [0x55df47d20422]
src/main.d:781 _Dmain [0x55df47d1d4b1]

@phlibi
Author

phlibi commented Sep 20, 2024

I downloaded sqlitebrowser and examined the items.sqlite3 file from this last unsuccessful run. There are indeed a few rows which have a binary BLOB containing bogus data in the mtime column:
[Screenshot: sqlitebrowser showing rows whose mtime column holds a binary BLOB instead of a text timestamp]
The hex dump shows :19Z at the end, so it looks like part of it would be valid, but the stuff in front is corrupted.
The other columns seem to be fine for these rows, so it doesn't look like a general corruption of the DB, but really just those individual mtime entries. Huh?
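
To make this check repeatable without a GUI, here is a rough sketch of a standalone D program that scans the database for such rows (the table name 'item' and the column names are assumptions based on the client's schema; adjust to match):

// build: dmd scanmtime.d -L-lsqlite3
import etc.c.sqlite3;
import std.conv : to;
import std.stdio : writefln;
import std.string : toStringz;

// Read a result column as a D string (the cast covers either binding signature).
string col(sqlite3_stmt* s, int i) {
    return to!string(cast(const(char)*) sqlite3_column_text(s, i));
}

void main() {
    sqlite3* db;
    assert(sqlite3_open_v2("items.sqlite3".toStringz, &db,
            SQLITE_OPEN_READONLY, null) == SQLITE_OK);

    // typeof(mtime) exposes rows stored as a BLOB instead of TEXT;
    // hex() makes the raw bytes printable.
    auto sql = "SELECT id, name, typeof(mtime), hex(mtime) FROM item
                WHERE typeof(mtime) != 'text'";
    sqlite3_stmt* stmt;
    assert(sqlite3_prepare_v2(db, sql.toStringz, -1, &stmt, null) == SQLITE_OK);

    while (sqlite3_step(stmt) == SQLITE_ROW)
        writefln("%s | %s | stored as %s | bytes: %s",
                col(stmt, 0), col(stmt, 1), col(stmt, 2), col(stmt, 3));

    sqlite3_finalize(stmt);
    sqlite3_close(db);
}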

@abraunegg
Owner

@phlibi
I am seeing that in the file you provided as well.

So .. the application is converting this correctly internally - what SQLite version are you running?

Now - I use ZFS heavily myself - could there potentially even be some ZFS|database issue going on?

Does your system use ECC memory? What version of ZFS are you using?

Other cases of ZFS | SQLite corruption: https://forums.unraid.net/bug-reports/stable-releases/sqlite-database-corruption-back-r2566/

So the question is: is this a true bug in the application, a filesystem bug, or a memory bug on your platform?

@abraunegg
Owner

@phlibi
What is the recordsize value for where this file resides: /home/phip/.config/onedrive/swxxxin/items.sqlite3 ?

SQLite's default page size is 4KB (4096 bytes) ... so the recordsize should be set to align with this?

@phlibi
Author

phlibi commented Sep 20, 2024

@abraunegg I now reproduced it a few times, and it really always affects the mtime entry for a handful of rows. Not always the same number of rows (one run was indeed completely successful), but always the same column, and only this column. While I see SQLite as a potential culprit, blaming ZFS or even bad hardware seems a bit too easy. I'm running on an ordinary notebook, so no ECC or anything like that.

About the versions and recordsize:

$ apt show libsqlite3-dev | grep Version      # Is this the right package to query?
Version: 3.40.1-2
$ zfs --version
zfs-2.2.6-1~bpo12+1
zfs-kmod-2.2.6-1~bpo12+1
$ zfs get recordsize zroot/data/home
NAME             PROPERTY    VALUE    SOURCE
zroot/data/home  recordsize  128K     default

The recordsize would be far too large, but if I'm not mistaken, this only affects performance.

Just to be sure, I ran the same sync on a completely different (and much older) machine and got the same issue there. This time it was a BAD TIMESTAMP (REGEX FAIL): 2024-Sep-20 10:08:03 (not only is the trailing Z missing; the month name and space separator make it a non-ISO format altogether). Looking at items.sqlite3, I also see some BLOB mtimes here, so basically exactly the same outcome. This machine runs ext4 on encrypted LVM, with these versions:

Xubuntu 20.04
Kernel 5.15.0-69-generic
libsqlite3-dev: 3.31.1-4ubuntu0.6

Out of curiosity, I then patched the code a bit, in the hope that it would crash when the broken item is written to the DB, instead of only when it is read:

diff --git a/src/itemdb.d b/src/itemdb.d
index f726d25..a42c987 100644
--- a/src/itemdb.d
+++ b/src/itemdb.d
@@ -657,6 +657,14 @@ final class ItemDatabase {
 
        private void bindItem(const ref Item item, ref Statement stmt) {
                with (stmt) with (item) {
+                       const string mtimeStr = mtime.toISOExtString();
+                       try {
+                               assert(isValidUTCDateTime(mtimeStr), "Bad mtime string: '" ~ mtimeStr ~ "'");
+                       } catch (Exception e) {
+                               writeln("Exception during isValidUTCDateTime assertion: '" ~ mtimeStr ~ "'");
+                               throw e; // Re-throw the exception after logging
+                       }
+
                        bind(1, driveId);
                        bind(2, id);
                        bind(3, name);
@@ -673,7 +681,7 @@ final class ItemDatabase {
                        bind(5, typeStr);
                        bind(6, eTag);
                        bind(7, cTag);
-                       bind(8, mtime.toISOExtString());
+                       bind(8, mtimeStr);
                        bind(9, parentId);
                        bind(10, quickXorHash);
                        bind(11, sha256Hash);

But this was not triggered. So either the corruption really happens inside the SQLite DB, or my patch just doesn't work as expected. As said, I've never used D before, and I'm of course not deep enough into your code base to judge whether this is the right place to add such a check, or the right way to do it.

I see that items are downloaded multithreaded. Is there a way to force it to run with only a single thread, to rule out any concurrency issues?

@abraunegg
Owner

@phlibi

But this was not triggered. So either the corruption really happens inside the SQLite DB, or my patch just doesn't work as expected. As said, I've never used D before, and I'm of course not deep enough into your code base to judge whether this is the right place to add such a check, or the right way to do it.

It is also interesting that, outside of the initial report, there are zero other reports of this .... maybe it's even something 'locale' related ...

I see that items are downloaded multithreaded. Is there a way to force it to run with only a single thread, to rule out any concurrency issues?

See the 'threads' value - https://github.com/abraunegg/onedrive/blob/master/docs/application-config-options.md#threads
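
For example, in the config file (using the same quoted syntax as the other options):

threads = "1"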

@advorsky73

advorsky73 commented Sep 20, 2024

Just to let you know: I basically ran into the same issue yesterday:

Attempting to perform a database vacuum to optimise database
Database vacuum is complete
core.time.TimeException@/usr/lib/ldc/x86_64-linux-gnu/include/d/std/datetime/systime.d(9101): Invalid ISO Extended String: 2024-Sep-19 17:22:58.440

??:? [0x55e4b1223da5]
??:? [0x55e4b1220a31]
??:? [0x55e4b1220692]
??:? [0x55e4b1246eef]
??:? [0x55e4b1246f5d]
??:? [0x55e4b123660e]
??:? [0x55e4b122d6e7]
??:? [0x55e4b122cb8a]
??:? [0x55e4b12bde1a]
??:? [0x55e4b12bb54e]
??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x14a210b9db7b]
??:? _d_run_main2 [0x14a210b9d995]
??:? _d_run_main [0x14a210b9d7ed]
??:? [0x14a21081c249]
??:? __libc_start_main [0x14a21081c304]
??:? [0x55e4b1202100]
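
Worth noting: the string in this trace ("2024-Sep-19 17:22:58.440"), like the earlier BAD TIMESTAMP (REGEX FAIL) value "2024-Sep-20 10:08:03", closely matches the shape of D's SysTime.toString() output (month name, space separator) rather than toISOExtString(). A small sketch of the difference (illustrative only; nothing here proves the client ever calls toString() on this path):

import std.datetime;
import std.stdio;

void main() {
    auto t = Clock.currTime(UTC());
    writeln(t.toString());        // e.g. 2024-Sep-19 17:22:58.4401281 (not ISO)
    writeln(t.toISOExtString());  // e.g. 2024-09-19T17:22:58.4401281Z
    // fromISOExtString() accepts only the second form; feeding it the first
    // throws the TimeException seen in these stack traces.
    writeln(SysTime.fromISOExtString(t.toISOExtString()));
}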

@phlibi
Author

phlibi commented Sep 20, 2024

Some update or progress, whatever it is:

  • On the secondary computer, I set threads = "1" in the config and re-ran the initial sync. Worked without issues, now trying again to check.
  • On the primary computer, I checked out tag v2.4.25, rebuilt and re-ran the initial sync. Seemed to take more time than with the current version, but also worked without issues. I'm running this again as well.

Regarding the "zero other reports" (which is apparently no longer true, thanks @advorsky73 for sharing): I don't know what the real cause of the issue is, but since it only seems to (reproducibly) appear with an initial sync and maybe only with Sharepoint (not plain OneDrive) and maybe only with a certain number of files and probably only with the 2.5 version, there is plenty of potential to get around it, especially for existing users. Time will tell whether more users are affected.

@phlibi
Author

phlibi commented Sep 20, 2024

After the second try, the outcome is the same: no issues with v2.5 and threads=1, and also no issues with v2.4.25. I guess I'll stick with the current version and threads=1 for now, at least until there is more insight.

@abraunegg
Owner

@advorsky73
Please can you provide the following details:

  • Operating System Details

@abraunegg
Owner

@phlibi

After the second try, the outcome is the same: no issues with v2.5 and threads=1, and also no issues with v2.4.25. I guess I'll stick with the current version and threads=1 for now, at least until there is more insight.

Very curious | interesting ....

In your debug logfile:

DEBUG: Threadsafe database value: 1

This is usually the default. It means that your SQLite library is compiled in serialized mode, the safest option for multithreaded use, and that this mode should be used by default when opening the database file ... maybe it is not ...... I was relying on this to ensure the DB was being opened in a threadsafe manner.

I have updated the PR to ensure/force the database is opened in the safest multithreaded way.
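
For illustration, forcing that at open time looks roughly like this (a sketch using the etc.c.sqlite3 bindings; not necessarily the PR's exact change):

import etc.c.sqlite3;
import std.string : toStringz;

// Request serialized (fully mutex-protected) mode explicitly, regardless
// of the library's compile-time threading default.
sqlite3* openSerialized(string dbPath) {
    sqlite3* db;
    int rc = sqlite3_open_v2(dbPath.toStringz, &db,
            SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX,
            null);
    if (rc != SQLITE_OK)
        throw new Exception("Cannot open database: " ~ dbPath);
    return db;
}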

Please can you rebuild the client to use onedrive v2.5.0-15-gfc87036 and increase your threads to the default.

@phlibi
Author

phlibi commented Sep 20, 2024

Unfortunately no luck:

$ ./onedrive --version
onedrive v2.5.0-15-gfc87036
$ ./onedrive --confdir=~/.config/onedrive/pruefprotokolle --single-directory Pruefprotokolle --sync --verbose
...
Attempting to perform a database vacuum to optimise database
Database vacuum is complete
std.utf.UTFException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d(1556): Invalid UTF-8 sequence (at index 1)
----------------
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1594 pure dchar std.utf.decodeImpl!(true, 0, const(char)[]).decodeImpl(ref const(char)[], ref ulong) [0x564515b522e4]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d:1186 pure @trusted dchar std.utf.decode!(0, const(char)[]).decode(scope ref const(char)[], ref ulong) [0x564515b52257]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:827 pure @safe bool std.regex.internal.ir.Input!(char).Input.nextChar(ref dchar, ref ulong) [0x564515b27d0e]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:789 pure @trusted bool std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.next() [0x564515b2bd20]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/thompson.d:943 pure @trusted int std.regex.internal.thompson.ThompsonMatcher!(char, std.regex.internal.ir.Input!(char).Input).ThompsonMatcher.match(std.regex.internal.ir.Group!(ulong).Group[]) [0x564515b3012d]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 pure void std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).__lambda4!(std.regex.internal.ir.Group!(ulong).Group[]).__lambda4(std.regex.internal.ir.Group!(ulong).Group[]) [0x564515b460ac]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/internal/ir.d:1122 pure void std.regex.internal.ir.SmallFixedArray!(std.regex.internal.ir.Group!(ulong).Group, 3u).SmallFixedArray.mutate(scope void delegate(std.regex.internal.ir.Group!(ulong).Group[]) pure) [0x564515b371f6]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:775 ref @trusted std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.RegexMatch!(immutable(char)[]).RegexMatch.__ctor!(std.regex.internal.ir.Regex!(char).Regex).__ctor(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x564515b46012]
/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/regex/package.d:1013 @safe std.regex.RegexMatch!(immutable(char)[]).RegexMatch std.regex.match!(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex).match(immutable(char)[], std.regex.internal.ir.Regex!(char).Regex) [0x564515b45e79]
src/util.d:540 bool util.isValidUTCDateTime(immutable(char)[]) [0x564515b4cb5b]
src/itemdb.d:701 itemdb.Item itemdb.ItemDatabase.buildItem(sqlite.Statement.Result) [0x564515baad08]
src/itemdb.d:505 itemdb.Item[] itemdb.ItemDatabase.selectChildren(const(char)[], const(char)[]) [0x564515ba919f]
src/sync.d:6346 itemdb.Item[] syncEngine.SyncEngine.getChildren(immutable(char)[], immutable(char)[]) [0x564515b914dc]
src/sync.d:3115 void syncEngine.SyncEngine.performDatabaseConsistencyAndIntegrityCheck() [0x564515b7f99e]
src/main.d:1232 void main.performStandardSyncProcess(immutable(char)[], monitor.Monitor) [0x564515a2a422]
src/main.d:781 _Dmain [0x564515a274b1]

And to be sure:

$ ./onedrive --confdir=~/.config/onedrive/pruefprotokolle --single-directory Pruefprotokolle --sync --verbose --verbose | grep -i thread
DEBUG: Config option 'threads'                      = 8
DEBUG: SQLite Threadsafe database value: 1
DEBUG: Initialised TaskPool worker with threads: 8
DEBUG: SyncEngine: Waiting for all internal threads to complete
DEBUG: Shutting down processPool in a thread blocking manner
std.utf.UTFException@/home/phip/dlang/dmd-2.109.1/linux/bin64/../../src/phobos/std/utf.d(1556): Invalid UTF-8 sequence (at index 1)
...

I see that according to all logic and documentation, this should be just fine. But obviously it is not :(

Just to make sure that I'm not hunting a phantom, I did it twice (clearing all the files and config backup and DB before each run) and it failed twice. Then I did the same again with threads = "1" and it worked without an error. Then back without limiting threads and the error is back. It is really fairly consistent.

Maybe a dumb question, since I don't know much about D or the application design: is it possible that your threads use individual SQLite objects which don't share the serialization mutex, and instead access the DB file concurrently? Or is there some other way that SQLite's own locking could be defeated by how the onedrive client uses it?

@abraunegg
Owner

abraunegg commented Sep 20, 2024

@phlibi

Maybe a dumb question, since I don't know much about D or the application design: is it possible that your threads use individual SQLite objects which don't share the serialization mutex, and instead access the DB file concurrently? Or is there some other way that SQLite's own locking could be defeated by how the onedrive client uses it?

The only threading is really when it is uploading|downloading actual files/data from Microsoft OneDrive.

Whilst this is set in 'main.d':

	// Set the default thread pool value
	defaultPoolThreads(to!int(appConfig.getValueLong("threads")));

This only governs the thread pool that is used when creating the threads for the upload|download processes.

With onedrive v2.5.0-15-gfc87036 .. you can try to edit this line to read:

	// Set the default thread pool value
	defaultPoolThreads(to!int(1));

See if that makes a difference .....

@abraunegg
Owner

@phlibi
Also - can you email me offline a complete hardware / platform / composition description?

I want to try and replicate your system as closely as possible, given that I cannot (no matter how hard I try) replicate your issue using a Sharepoint folder, and/or 50K worth of objects online, and/or Debian 12 (albeit in a VM, not on ZFS, although the VM, via VMware ESXi, sits on a data store served via NFS from a ZFS volume).

@phlibi
Author

phlibi commented Sep 21, 2024

	// Set the default thread pool value
	defaultPoolThreads(to!int(1));


See if that makes a difference .....

Just tried this and it still failed. I first thought that this would be exactly the same as just setting threads = "1" in the config file, but when grepping through the code, I see that this configuration variable is actually used multiple times, e.g. also in sync.d:

		processPool = new TaskPool(to!int(appConfig.getValueLong("threads")));
		addLogEntry("Initialised TaskPool worker with threads: " ~ to!string(processPool.size), ["debug"]);

Are there different thread pools in use?
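
If I understand std.parallelism correctly, defaultPoolThreads only sizes the implicit global taskPool (created lazily on first use); a pool constructed with new TaskPool(n) is independent and unaffected, which would explain the 8 workers above despite the edit. A tiny sketch of my own, not the client's code:

import std.parallelism;
import std.stdio;

void main() {
    defaultPoolThreads(1);       // affects only the lazily created global taskPool
    auto own = new TaskPool(8);  // an explicitly constructed pool is independent
    writeln("global pool workers: ", taskPool.size); // 1
    writeln("explicit pool workers: ", own.size);    // 8
    own.finish();
}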

Also - can you email me offline a complete hardware / platform / composition description?

Not much special there.
Main notebook:

  • Thinkpad T14 from 2023 with Ryzen 7 7840U and 32G RAM
  • Debian 12 Bookworm
  • All packages up-to-date
  • Root filesystem on ZFS with compression and encryption, otherwise default settings

Secondary notebook (off the top of my head, as I'm out of the office and can't check right now):

  • Thinkpad X230 (?) from about 2014 with some Intel i5 and 8GB RAM
  • Xubuntu 20.04
  • All packages up-to-date
  • Encrypted LVM and ext4 as created using the Xubuntu installer

At least from the client side, I don't think there is anything that would lead to my systems behaving differently than what you can easily replicate, be it with a VM or on bare metal.

I see the same effect no matter whether I'm connected to a 1GBit/s wired internet connection or an unstable link using my mobile phone.

Where I see the most potential for differences is on the server side. We use the "Microsoft 365 Business Standard" subscription with close to 20 users. As far as I know, our data is hosted in Swiss or at least EU datacenters, or at least it is supposed to be. Since you already found some differences in behavior when the client connects to EU datacenters, this might be a relevant factor.

We organize the data mostly in Teams, with Sharepoint just used as the storage backend. To replicate our situation on the server side, it should be sufficient to create a Team in Teams and drop a number of directories and relatively small files into it. I currently synchronize a single directory of a Team/site with the following figures:

$ find /scratch/onedrive-pruefprotokolle/ -type f | wc
   2950    3047  309761
$ find /scratch/onedrive-pruefprotokolle/ -type d | wc
     28      28    1993
$ du -hd0 /scratch/onedrive-pruefprotokolle/
102M    /scratch/onedrive-pruefprotokolle/

~3000 files in 28 directories with a total size of ~100MB, or an average file size of ~33kB. This already reliably triggers the failure in, say, 90% of all initial synchronization runs. I haven't counted exactly, but I would say there happen to be 5-10 severely corrupted mtime entries in the DB file afterwards. This includes the ones flagged as "BLOB", but not those which contain a string in a bad format.

While the server/datacenter certainly can't cause such a crash on the client directly, I see several notifications of throttling (too many requests), leading to delays of sometimes up to ~200s in all threads. Since this happens more frequently for us when threads are used, it also greatly reduces the speed gain of multithreading, as the client then simply wastes much more time waiting for the server to accept new requests. Do you see this effect as well? If not, it could be the reason why we see the corruption and you don't:

  1. The client starts downloading files in multiple threads
  2. Even though the files may be very similar in size, there will quickly be some skew between the threads, so that downloads complete at different times across all threads
  3. When the process runs this way, chances are low that two threads update the DB at the same time, due to this spread
  4. Now, whenever throttling occurs, it is afterwards like restarting the whole process: All threads begin downloading at the same time again and if they download files of similar sizes, they may also complete at nearly the same time, at least for each first file after the throttling
  5. When this happens frequently, the chances that two threads really write to the DB at the same time are higher than when all threads keep running concurrently with no "synchronization" from the server side.

Does this sound reasonable?
