Question about writing to SQLite database file while writing to USB drive?

Gary1 · 5 November 2024 02:57

I suspect that this is not something that can be easily answered but this recent post on the SQLite forum sounds a bit related to the answers received to a question I asked here recently. It’s posts 3 through 6.

If I understand the SQLite post correctly, the claim is that an application reading/writing to a SQLite database file, while the same computer is writing a large amount of data to a USB drive, can experience long delays because writing to the USB drive caused the application to have to wait for its fsync request to be fulfilled.

Keeping with the previous posts here, if I were to write 6 GB to a USB drive and have 32 GB of RAM, the full 6 GB can be cached in RAM since it’s less than 20%. Does this mean that a write to a SQLite file can’t complete until that 6 GB is fully transferred to the USB drive?

If these items are related, how does udev-usb-sync affect them?

If I use SQLite in an application should there be any difference between trying to test this directly in the application and testing writes in the SQLite CLI while writing to a USB drive?

Thank you.

BG405 · 5 November 2024 06:26

I’m not quite sure what you mean by this.

Are you trying to write data which is still being updated to the source?

… In that case, there’s no choice but to wait for any modifications being done to the source data to be complete before attempting transfer to any destination. Unless it’s some sort of streaming you are doing?

linux-aarhus · 5 November 2024 06:27

I don’t completely follow your thought process.

udev-usb-sync is designed to eliminate the gap between cache-to-ram and write-to-usb by limiting the cache and enable sync-writes.

This should be understood as when the work is done - it is done - and data is completely written to disk - allowing for immediate removal of the device without dataloss.

If you are running any application either on the USB stick or targeting the USB stick - in your example - a sqlite database stored on the stick - the udev rules will affect that as well.

Depending on your USB port and the USB device in question this may cause a very long application execution time as the application will have to wait for the data to be written to USB before the task can be considered complete.

With very slow USB ports and devices this may be unacceptable - it may even seem the process is hung.

BG405 · 5 November 2024 06:35

I was getting a little confused here, thinking this was being treated like a “live update” using the USB storage at the same time. In any case, the sync command may be useful here.

I’ve had to put up with very slow transfer rates with certain USB devices (flash sticks); SATA to USB seems blazingly fast.

papajoke · 5 November 2024 06:53

or a workaround? use /tmp is acceptable ?

a bash file of the type:

make a backup of /mnt/usb/datas/file.db
copy /mnt/usb/datas/file.db to /tmp and remove it
symbolic link /tmp/file.db to /mnt/usb/datas/file.db
launch application
if ok: copy /tmp/files.db to /mnt/usb/datas/file.db

Gary1 · 5 November 2024 22:58

I probably didn’t express it very well. The application is writing to a SQLite database on the machine’s hard drive. However, this is attempted after a long write (not by the application and not even a SQLite file) to a USB drive has been commenced. When this condition was met, the originators of those few posts found that the application’s data didn’t reach disk until after the write to the USB completed and this caused a delay in the application because it waits (I assume for the OS) to inform it that the data made it to disk rather than cache.

I’ve never experienced this because I’ve never done it. I usually do nothing else until the write to the USB drive has finished and it can removed.

I have an application that uses SQLite a great deal and will try to test this scenario but wondered if it is even plausible, whether anyone here ever experienced it and, if so, how was it handled.

Or, maybe I misunderstood those posts and have it all wrong.

Thank you.

linux-aarhus · 6 November 2024 06:55

So - the speculation is - to separate processes

process X copy a huge amount of data to a slow USB
process Y is using a sqlite database

The speculation is - does these processes interfere with eachother when it comes to writing data to the respective destinations.

Is that correct?

If it is - then the assumption that the Y process could be delayed by the X process is very far fetched - and unlikely to happen in real world.

If the speculative situation occur - the udev-usb-sync package will eliminate the issue as the copy process will stream to disk and only cache a few MB at any given time (calculated from USB port speed).

If the speculative situation occor without the mentioned sync rules/script then the cache would be on two different applications and the writing to disk is to different devices and the processes should not interfere with each other.

Gary1 · 6 November 2024 23:10

Thank you for the explanation. I read those posts over again and can only conclude that this is the case, because the application and the write to USB could not be writing to the same database file since it is two different storage devices, and, therefore, it could not be a SQLite busy issue.

It might make sense (to a novice like me) if there were only one fsync queue such that the fysnc needed for the SQLite transaction couldn’t take place until the fsync of the USB drive took place/completed. But that is well over my head at this time.

My application uses JavaScript in the GUI and sends a request over a local network web socket to a Tcl server channel that then uses the Tcl API to SQLite to make the database request. The GUI does not block but uses a promise-based communication structure to handle the response when it arrives. I have no idea in what thread all of this takes place relative to an in-progress USB drive write. I understand that (apart from web workers) JavsScript is single threaded, but the promise itself and the Tcl/SQLite operations that follow I suspect are in another thread(s) and the response goes on the JavaScript stack when it arrives and invokes the promise’s resolve/reject handler.

I assume that all I can do to test this scenario is write something large to the USB drive and read and write via the application to its databases and see what happens.

Thanks.