xftp: add sending and receiving via URI-encoded redirects (#968)
* xftp: add URI encoding for FileDescription

* tweak URI

* allow smaller blocks

* draft xftpReceiveFileFollow' and xftpSendFilePublic'

* add sending with redirect

* allow 64k chunks

* add migrations with redirect fields

* add test case

* fix deadlock

* revert CLI code

* WIP: working send/receive via URI

* fix field ambiguity

* cleanup

* update agent db schema

* update minimal chunk size

* add rfc

* apply suggestions from code review

Co-authored-by: Evgeny Poberezkin <evgeny@poberezkin.com>

* add createRcvFileRedirect

* extract Simplex.Messaging.ServiceScheme and reuse for files

* update db schema

* check size/digest on receive complete

* cleanup

* use SIZE/DIGEST errors for redirects too

* split digest/size errors from redirect checks

* fix redirect error encoding

* rename RedirectMeta to RedirectFileInfo

* use query encoding for file URI

* group maybe fields under RcvFileRedirect

* add extras field

* update rfc

* add extras encoding and no-redirect tests

* fix toStrict for old ghc

* extra client data in file descr URI

* remove decoded yaml file

---------

Co-authored-by: Evgeny Poberezkin <evgeny@poberezkin.com>
dpwiz and epoberezkin committed Feb 21, 2024
1 parent 2875d90 commit 65aa290
Showing 21 changed files with 558 additions and 127 deletions.
73 changes: 73 additions & 0 deletions rfcs/2024-01-26-file-links.md
@@ -0,0 +1,73 @@
# Sending large file descriptions

It is desirable to provide a QR code/URI from which a file can be downloaded. This way files may be addressed outside a chat client.
Currently the `xftp` CLI tool can generate YAML file descriptions that can be used to receive a file.
It is possible to pass such a description as a URI, but descriptions for files larger than ~8 MB (two 4 MB chunks) would give QR codes that are difficult to process.
A user can manually upload a description and get a shorter one. Descriptions for files up to ~20 GB would typically still be small enough not to require another pass, which is well beyond any current (or reasonable) limit.

It is possible to streamline this process, so that any application using the simplexmq agent can easily send file descriptions and follow redirects.
A file description with a redirect contains an extra field with the final file size and digest, so it can be followed automatically.

The flow would be like this:

- Sending:
  1. Upload the file as usual with `xftpSendFile`, get recipient file descriptions in the `SFDONE` message.
  2. Upload one of the file descriptions with `xftpSendDescription`, get its redirect description in its `SFDONE` message.
  3. Wrap it in `FileDescriptionURI` and use `strEncode` to get a QR-sized URI.
  4. Show the QR code / copy the link.
- Receiving:
  1. Scan the QR code / paste the link.
  2. Use `strDecode` and unwrap `FileDescriptionURI` to get a `ValidFileDescription 'FRecipient`.
  3. Download it as usual with `xftpReceiveFile`, getting an `RFDONE` message when the file is fully received.

It is not necessary to use a redirect description if the original description can be encoded to fit in 1002 characters. Beyond this size there is a significant jump in QR code complexity.
It is possible to call `encodeFileDescriptionURI` right after the upload to test whether the URI fits and, if it does, skip step 2.
When `xftpReceiveFile` receives a decoded description that lacks a `redirect` field, the procedure for downloading a file is the same as usual: download the chunks and reassemble the local file.
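
The size check described above can be sketched as follows. This is only an illustration under stated assumptions: `maxQRUriLength`, `LinkPlan` and `planLink` are hypothetical names that do not exist in simplexmq, and the URI is treated as a plain `String` that has already been encoded.

```haskell
-- Hypothetical sketch: none of these names come from simplexmq.

-- | Largest URI length (in characters) that the RFC treats as QR-friendly.
maxQRUriLength :: Int
maxQRUriLength = 1002

-- | What the sender should do with the description from the first upload.
data LinkPlan
  = UseDirect String -- the encoded URI already fits: skip step 2
  | NeedsRedirect -- too long: upload the description and share the redirect
  deriving (Eq, Show)

-- | Decide from the already-encoded URI of the original description.
planLink :: String -> LinkPlan
planLink uri
  | length uri <= maxQRUriLength = UseDirect uri
  | otherwise = NeedsRedirect
```

With this shape, an application would only call `xftpSendDescription` when `planLink` returns `NeedsRedirect`.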

## Agent changes

### Sending

Sending and receiving files in the agent is a multi-step process mediated by DB entries in the `snd_files` and `rcv_files` tables.

`xftpSendDescription` is tasked with storing the original description in a temporary locally-encrypted file, then creating an upload task for it.

It is necessary to preserve redirect metadata so it can be attached to descriptions in the `SFDONE` message sent by a worker:

```sql
ALTER TABLE snd_files ADD COLUMN redirect_size INTEGER;
ALTER TABLE snd_files ADD COLUMN redirect_digest BLOB;
```

### Receiving

`xftpReceiveFile` gets a file description as an argument and determines from it whether to follow the redirect procedure or run an ordinary download.
For redirects, it prepares a `RcvFile` for the redirect and a placeholder for the final file.
Agent messages are sent using the entity ID of the final file, which is stored along with the redirect metadata in the `RcvFile` for the redirect.

```sql
ALTER TABLE rcv_files ADD COLUMN redirect_id INTEGER REFERENCES rcv_files ON DELETE CASCADE; -- for later updates
ALTER TABLE rcv_files ADD COLUMN redirect_entity_id BLOB; -- for notifications
ALTER TABLE rcv_files ADD COLUMN redirect_size INTEGER;
ALTER TABLE rcv_files ADD COLUMN redirect_digest BLOB;
```

These additional fields are populated on the record for the redirect file, that is, the short description used to receive the actual description of the final file.

While the description YAML is being downloaded, the application will get `RFPROG` messages tagged for the final entity, containing the bytes downloaded so far and the total size of the original file.
When the description is fully downloaded, the worker decodes it and checks that the stated size and digest match those declared in the redirect.
Then it replaces the placeholder description in `rcv_files` for the destination file with the actual data from the downloaded description.
Finally, instead of sending `RFDONE` for the redirect, it hands the work over to the chunk download worker, which runs exactly as if the user had requested the download directly.
The application will then receive `RFPROG` and `RFDONE` messages as usual.
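
The check performed after the description is downloaded can be sketched as a pure function. This is a simplified illustration, not the agent's code: `FileInfo`, `RedirectError` and `checkRedirect` are hypothetical stand-ins, the digest is kept as an opaque `String`, and a failed decode is modelled as `Nothing`. The order of the checks (decode, then size, then digest) follows the order used in this commit's `decryptFile`.

```haskell
import Data.Int (Int64)

-- Hypothetical, simplified stand-ins for the agent's types.
data FileInfo = FileInfo
  { fileSize :: Int64 -- size in bytes
  , fileDigest :: String -- digest, kept as an opaque string in this sketch
  }
  deriving (Eq, Show)

data RedirectError = DecodeError | SizeMismatch | DigestMismatch
  deriving (Eq, Show)

-- | Compare the decoded description (Nothing if decoding failed)
-- with the size and digest that the redirect declared.
checkRedirect :: FileInfo -> Maybe FileInfo -> Either RedirectError FileInfo
checkRedirect _ Nothing = Left DecodeError
checkRedirect declared (Just actual)
  | fileSize actual /= fileSize declared = Left SizeMismatch
  | fileDigest actual /= fileDigest declared = Left DigestMismatch
  | otherwise = Right actual
```

Only a `Right` result would let the worker replace the placeholder and start downloading the chunks of the final file.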

## URI encoding

File description URIs use the same service scheme as contact links, `simplex:` or its `https://simplex.chat` (or any custom host) equivalent, and can be extracted from text and processed the same way.
The path section is `/file` (with an optional trailing `/`).
The payload is encoded in the "fragment" part of the link, using `#/?`, followed by a query string.
The file description is first encoded as a YAML document, then URL-encoded under the key `d`.
An application may want to pass extra parameters not necessary to download a file. Those go in the `_` key, encoded as a JSON dictionary.

An example link:

`simplex:/file#/?d=chunkSize%3A%2064kb%0Adigest%3A%20OtpnXkECTW4a18Eots2m3O22maeOCMqPUX4ulugIjgMEJfCpTYc_-T257Uw7s9bW_F0G5WBg5BioBWd4Z_OoCw%3D%3D%0Akey%3A%20rNR8_2SJuH7Qve43gV3zszL0R6oY5HSdRZT_paB-wfE%3D%0Anonce%3A%202oKwfK-w75nwyWp8_1Lv6QnQonIRtJmG%0Aparty%3A%20recipient%0Areplicas%3A%0A-%20chunks%3A%0A%20%20-%201%3ATdvaxMnG2Ph1e3QCx3-rpA%3D%3D%3AMC4CAQAwBQYDK2VwBCIEILdErEICvgrBCajDLTX2h3LXyMB7z5vrtLa3XVigJuf-%3ANS46KuYdgOWs6dUeMp7p2oF8rBQ9wQ2Ez6TW6Y6gHg0%3D%0A%20%20-%202%3AH5SRbtKYrXWVXTthrkeWzw%3D%3D%3AMC4CAQAwBQYDK2VwBCIEIGeEPNLt7lUGPfplwsoJLCDFnbIc5Hm31kz5X6rWXmgu%3A7QNRI-gvFx9UM-baXp3YVDli9pcfh3HGFKDhsA9JQHY%3D%0A%20%20-%203%3A_xjukkIl9WZFryUXT0h_TQ%3D%3D%3AMC4CAQAwBQYDK2VwBCIEIIRFBaL1HvUfePvKLuggwUrC_q_ZHd7v08IL9jhM7teC%3Aid2lgLMMjTGsR8SUogJuRdLoEHAc5SDQKFDqlZRSuEY%3D%0A%20%20server%3A%20xftp%3A%2F%2FLcJUMfVhwD8yxjAiSaDzzGF3-kLG4Uh0Fl_ZIjrRwjI%3D%40localhost%3A7002%0Asize%3A%20192kb%0A&_=%7B%22k%22:%22test%22%7D`
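
The link layout described above can be sketched with a small encoder. This is a hedged illustration, not the library's encoder: `percentEncode` and `fileLink` are hypothetical helpers, the input is assumed to be ASCII, and every character outside the RFC 3986 "unreserved" set is escaped. The example link above leaves `:` unescaped in the `_` value, so the real encoder is evidently more permissive for that key; both forms decode to the same text.

```haskell
import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Numeric (showHex)

-- | Percent-encode every character except RFC 3986 "unreserved" ones.
-- Assumption: the input is ASCII (true for the YAML and JSON used here).
percentEncode :: String -> String
percentEncode = concatMap enc
  where
    enc c
      | isAscii c && (isAlphaNum c || c `elem` "-._~") = [c]
      | otherwise = '%' : pad (map toUpper (showHex (ord c) ""))
    -- make sure single-digit hex values get a leading zero, e.g. "A" -> "0A"
    pad [d] = ['0', d]
    pad ds = ds

-- | Build a file link from a YAML description and optional JSON extras.
-- The description goes under the key "d", the extras under the key "_".
fileLink :: String -> Maybe String -> String
fileLink descrYaml extras =
  "simplex:/file#/?d="
    <> percentEncode descrYaml
    <> maybe "" (\json -> "&_=" <> percentEncode json) extras
```

For example, `fileLink "size: 192kb\n" Nothing` gives `simplex:/file#/?d=size%3A%20192kb%0A`, which matches the tail of the `d` value in the example link.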
2 changes: 2 additions & 0 deletions simplexmq.cabal
@@ -102,6 +102,7 @@ library
Simplex.Messaging.Agent.Store.SQLite.Migrations.M20231222_command_created_at
Simplex.Messaging.Agent.Store.SQLite.Migrations.M20231225_failed_work_items
Simplex.Messaging.Agent.Store.SQLite.Migrations.M20240121_message_delivery_indexes
Simplex.Messaging.Agent.Store.SQLite.Migrations.M20240124_file_redirect
Simplex.Messaging.Agent.TRcvQueues
Simplex.Messaging.Client
Simplex.Messaging.Client.Agent
@@ -142,6 +143,7 @@ library
Simplex.Messaging.Server.QueueStore.STM
Simplex.Messaging.Server.Stats
Simplex.Messaging.Server.StoreLog
Simplex.Messaging.ServiceScheme
Simplex.Messaging.TMap
Simplex.Messaging.Transport
Simplex.Messaging.Transport.Buffer
97 changes: 76 additions & 21 deletions src/Simplex/FileTransfer/Agent.hs
@@ -7,6 +7,7 @@
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TupleSections #-}
{-# LANGUAGE TypeApplications #-}
{-# OPTIONS_GHC -fno-warn-ambiguous-fields #-}

@@ -19,6 +20,7 @@ module Simplex.FileTransfer.Agent
xftpDeleteRcvFile',
-- Sending files
xftpSendFile',
xftpSendDescription',
deleteSndFileInternal,
deleteSndFileRemote,
)
@@ -44,6 +46,7 @@ import Simplex.FileTransfer.Client.Main
import Simplex.FileTransfer.Crypto
import Simplex.FileTransfer.Description
import Simplex.FileTransfer.Protocol (FileParty (..), SFileParty (..))
import qualified Simplex.FileTransfer.Protocol as XFTP
import Simplex.FileTransfer.Transport (XFTPRcvChunkSpec (..))
import Simplex.FileTransfer.Types
import Simplex.FileTransfer.Util (removePath, uniqueCombine)
@@ -57,6 +60,7 @@ import Simplex.Messaging.Crypto.File (CryptoFile (..), CryptoFileArgs)
import qualified Simplex.Messaging.Crypto.File as CF
import qualified Simplex.Messaging.Crypto.Lazy as LC
import Simplex.Messaging.Encoding
import Simplex.Messaging.Encoding.String (strDecode, strEncode)
import Simplex.Messaging.Protocol (EntityId, XFTPServer)
import Simplex.Messaging.Util (liftError, tshow, unlessM, whenM)
import System.FilePath (takeFileName, (</>))
@@ -97,7 +101,7 @@ closeXFTPAgent a = do
stopWorkers workers = atomically (swapTVar workers M.empty) >>= mapM_ (liftIO . cancelWorker)

xftpReceiveFile' :: AgentMonad m => AgentClient -> UserId -> ValidFileDescription 'FRecipient -> Maybe CryptoFileArgs -> m RcvFileId
xftpReceiveFile' c userId (ValidFileDescription fd@FileDescription {chunks}) cfArgs = do
xftpReceiveFile' c userId (ValidFileDescription fd@FileDescription {chunks, redirect}) cfArgs = do
g <- asks random
prefixPath <- getPrefixPath "rcv.xftp"
createDirectory prefixPath
@@ -107,14 +111,25 @@ xftpReceiveFile' c userId (ValidFileDescription fd@FileDescription {chunks}) cfA
createDirectory =<< toFSFilePath relTmpPath
createEmptyFile =<< toFSFilePath relSavePath
let saveFile = CryptoFile relSavePath cfArgs
fId <- withStore c $ \db -> createRcvFile db g userId fd relPrefixPath relTmpPath saveFile
forM_ chunks downloadChunk
fId <- case redirect of
Nothing -> withStore c $ \db -> createRcvFile db g userId fd relPrefixPath relTmpPath saveFile
Just _ -> do
-- prepare description paths
let relTmpPathRedirect = relPrefixPath </> "xftp.redirect-encrypted"
relSavePathRedirect = relPrefixPath </> "xftp.redirect-decrypted"
createDirectory =<< toFSFilePath relTmpPathRedirect
createEmptyFile =<< toFSFilePath relSavePathRedirect
cfArgsRedirect <- atomically $ CF.randomArgs g
let saveFileRedirect = CryptoFile relSavePathRedirect $ Just cfArgsRedirect
-- create download tasks
withStore c $ \db -> createRcvFileRedirect db g userId fd relPrefixPath relTmpPathRedirect saveFileRedirect relTmpPath saveFile
forM_ chunks (downloadChunk c)
pure fId
where
downloadChunk :: AgentMonad m => FileChunk -> m ()
downloadChunk FileChunk {replicas = (FileChunkReplica {server} : _)} = do
void $ getXFTPRcvWorker True c (Just server)
downloadChunk _ = throwError $ INTERNAL "no replicas"

downloadChunk :: AgentMonad m => AgentClient -> FileChunk -> m ()
downloadChunk c FileChunk {replicas = (FileChunkReplica {server} : _)} = do
void $ getXFTPRcvWorker True c (Just server)
downloadChunk _ _ = throwError $ INTERNAL "no replicas"

getPrefixPath :: AgentMonad m => String -> m FilePath
getPrefixPath suffix = do
@@ -172,14 +187,17 @@ runXFTPRcvWorker c srv Worker {doWork} = do
relChunkPath = fileTmpPath </> takeFileName chunkPath
agentXFTPDownloadChunk c userId digest replica chunkSpec
atomically $ waitUntilForeground c
(complete, progress) <- withStore c $ \db -> runExceptT $ do
(entityId, complete, progress) <- withStore c $ \db -> runExceptT $ do
liftIO $ updateRcvFileChunkReceived db (rcvChunkReplicaId replica) rcvChunkId relChunkPath
RcvFile {size = FileSize total, chunks} <- ExceptT $ getRcvFile db rcvFileId
RcvFile {size = FileSize currentSize, chunks, redirect} <- ExceptT $ getRcvFile db rcvFileId
let rcvd = receivedSize chunks
complete = all chunkReceived chunks
(entityId, total) = case redirect of
Nothing -> (rcvFileEntityId, currentSize)
Just RcvFileRedirect {redirectFileInfo = RedirectFileInfo {size = FileSize finalSize}, redirectEntityId} -> (redirectEntityId, finalSize)
liftIO . when complete $ updateRcvFileStatus db rcvFileId RFSReceived
pure (complete, RFPROG rcvd total)
notify c rcvFileEntityId progress
pure (entityId, complete, RFPROG rcvd total)
notify c entityId progress
when complete . void $
getXFTPRcvWorker True c Nothing
where
@@ -223,20 +241,41 @@ runXFTPRcvLocalWorker c Worker {doWork} = do
\f@RcvFile {rcvFileId, rcvFileEntityId, tmpPath} ->
decryptFile f `catchAgentError` (rcvWorkerInternalError c rcvFileId rcvFileEntityId tmpPath . show)
decryptFile :: RcvFile -> m ()
decryptFile RcvFile {rcvFileId, rcvFileEntityId, key, nonce, tmpPath, saveFile, status, chunks} = do
decryptFile RcvFile {rcvFileId, rcvFileEntityId, size, digest, key, nonce, tmpPath, saveFile, status, chunks, redirect} = do
let CryptoFile savePath cfArgs = saveFile
fsSavePath <- toFSFilePath savePath
when (status == RFSDecrypting) $
whenM (doesFileExist fsSavePath) (removeFile fsSavePath >> createEmptyFile fsSavePath)
withStore' c $ \db -> updateRcvFileStatus db rcvFileId RFSDecrypting
chunkPaths <- getChunkPaths chunks
encSize <- liftIO $ foldM (\s path -> (s +) . fromIntegral <$> getFileSize path) 0 chunkPaths
when (FileSize encSize /= size) $ throwError $ XFTP XFTP.SIZE
encDigest <- liftIO $ LC.sha512Hash <$> readChunks chunkPaths
when (FileDigest encDigest /= digest) $ throwError $ XFTP XFTP.DIGEST
let destFile = CryptoFile fsSavePath cfArgs
void $ liftError (INTERNAL . show) $ decryptChunks encSize chunkPaths key nonce $ \_ -> pure destFile
notify c rcvFileEntityId $ RFDONE fsSavePath
forM_ tmpPath (removePath <=< toFSFilePath)
atomically $ waitUntilForeground c
withStore' c (`updateRcvFileComplete` rcvFileId)
case redirect of
Nothing -> do
notify c rcvFileEntityId $ RFDONE fsSavePath
forM_ tmpPath (removePath <=< toFSFilePath)
atomically $ waitUntilForeground c
withStore' c (`updateRcvFileComplete` rcvFileId)
Just RcvFileRedirect {redirectFileInfo, redirectDbId} -> do
let RedirectFileInfo {size = redirectSize, digest = redirectDigest} = redirectFileInfo
forM_ tmpPath (removePath <=< toFSFilePath)
atomically $ waitUntilForeground c
withStore' c (`updateRcvFileComplete` rcvFileId)
-- proceed with redirect
yaml <- liftError (INTERNAL . show) (CF.readFile $ CryptoFile fsSavePath cfArgs) `finally` (toFSFilePath fsSavePath >>= removePath)
next@FileDescription {chunks = nextChunks} <- case strDecode (LB.toStrict yaml) of
Left _ -> throwError . XFTP $ XFTP.REDIRECT "decode error"
Right (ValidFileDescription fd@FileDescription {size = dstSize, digest = dstDigest})
| dstSize /= redirectSize -> throwError . XFTP $ XFTP.REDIRECT "size mismatch"
| dstDigest /= redirectDigest -> throwError . XFTP $ XFTP.REDIRECT "digest mismatch"
| otherwise -> pure fd
-- register and download chunks from the actual file
withStore c $ \db -> updateRcvFileRedirect db redirectDbId next
forM_ nextChunks (downloadChunk c)
where
getChunkPaths :: [RcvFileChunk] -> m [FilePath]
getChunkPaths [] = pure []
@@ -268,7 +307,23 @@ xftpSendFile' c userId file numRecipients = do
key <- atomically $ C.randomSbKey g
nonce <- atomically $ C.randomCbNonce g
-- saving absolute filePath will not allow to restore file encryption after app update, but it's a short window
fId <- withStore c $ \db -> createSndFile db g userId file numRecipients relPrefixPath key nonce
fId <- withStore c $ \db -> createSndFile db g userId file numRecipients relPrefixPath key nonce Nothing
void $ getXFTPSndWorker True c Nothing
pure fId

xftpSendDescription' :: forall m. AgentMonad m => AgentClient -> UserId -> ValidFileDescription 'FRecipient -> m SndFileId
xftpSendDescription' c userId (ValidFileDescription fdDirect@FileDescription {size, digest}) = do
g <- asks random
prefixPath <- getPrefixPath "snd.xftp"
createDirectory prefixPath
let relPrefixPath = takeFileName prefixPath
let directYaml = prefixPath </> "direct.yaml"
cfArgs <- atomically $ CF.randomArgs g
let file = CryptoFile directYaml (Just cfArgs)
liftError (INTERNAL . show) $ CF.writeFile file (LB.fromStrict $ strEncode fdDirect)
key <- atomically $ C.randomSbKey g
nonce <- atomically $ C.randomCbNonce g
fId <- withStore c $ \db -> createSndFile db g userId file 1 relPrefixPath key nonce $ Just RedirectFileInfo {size, digest}
void $ getXFTPSndWorker True c Nothing
pure fId

@@ -423,15 +478,15 @@ runXFTPSndWorker c srv Worker {doWork} = do
sndFileToDescrs :: SndFile -> m (ValidFileDescription 'FSender, [ValidFileDescription 'FRecipient])
sndFileToDescrs SndFile {digest = Nothing} = throwError $ INTERNAL "snd file has no digest"
sndFileToDescrs SndFile {chunks = []} = throwError $ INTERNAL "snd file has no chunks"
sndFileToDescrs SndFile {digest = Just digest, key, nonce, chunks = chunks@(fstChunk : _)} = do
sndFileToDescrs SndFile {digest = Just digest, key, nonce, chunks = chunks@(fstChunk : _), redirect} = do
let chunkSize = FileSize $ sndChunkSize fstChunk
size = FileSize $ sum $ map (fromIntegral . sndChunkSize) chunks
-- snd description
sndDescrChunks <- mapM toSndDescrChunk chunks
let fdSnd = FileDescription {party = SFSender, size, digest, key, nonce, chunkSize, chunks = sndDescrChunks}
let fdSnd = FileDescription {party = SFSender, size, digest, key, nonce, chunkSize, chunks = sndDescrChunks, redirect = Nothing}
validFdSnd <- either (throwError . INTERNAL) pure $ validateFileDescription fdSnd
-- rcv descriptions
let fdRcv = FileDescription {party = SFRecipient, size, digest, key, nonce, chunkSize, chunks = []}
let fdRcv = FileDescription {party = SFRecipient, size, digest, key, nonce, chunkSize, chunks = [], redirect}
fdRcvs = createRcvFileDescriptions fdRcv chunks
validFdRcvs <- either (throwError . INTERNAL) pure $ mapM validateFileDescription fdRcvs
pure (validFdSnd, validFdRcvs)
10 changes: 5 additions & 5 deletions src/Simplex/FileTransfer/Client/Main.hs
@@ -15,6 +15,7 @@ module Simplex.FileTransfer.Client.Main
CLIError (..),
xftpClientCLI,
cliSendFile,
cliSendFileOpts,
prepareChunkSizes,
prepareChunkSpecs,
maxFileSize,
@@ -297,8 +298,8 @@ cliSendFileOpts SendOptions {filePath, outputDir, numRecipients, xftpServers, re
withExceptT (CLIError . show) $ encryptFile srcFile fileHdr key nonce fileSize' encSize encPath
digest <- liftIO $ LC.sha512Hash <$> LB.readFile encPath
let chunkSpecs = prepareChunkSpecs encPath chunkSizes
fdRcv = FileDescription {party = SFRecipient, size = FileSize encSize, digest = FileDigest digest, key, nonce, chunkSize = FileSize defChunkSize, chunks = []}
fdSnd = FileDescription {party = SFSender, size = FileSize encSize, digest = FileDigest digest, key, nonce, chunkSize = FileSize defChunkSize, chunks = []}
fdRcv = FileDescription {party = SFRecipient, size = FileSize encSize, digest = FileDigest digest, key, nonce, chunkSize = FileSize defChunkSize, chunks = [], redirect = Nothing}
fdSnd = FileDescription {party = SFSender, size = FileSize encSize, digest = FileDigest digest, key, nonce, chunkSize = FileSize defChunkSize, chunks = [], redirect = Nothing}
logInfo $ "encrypted file to " <> tshow encPath
pure (encPath, fdRcv, fdSnd, chunkSpecs, encSize)
uploadFile :: TVar ChaChaDRG -> [XFTPChunkSpec] -> TVar [Int64] -> Int64 -> ExceptT CLIError IO [SentFileChunk]
@@ -526,9 +527,8 @@ prepareChunkSizes size' = prepareSizes size'
where
(smallSize, bigSize)
| size' > size34 chunkSize3 = (chunkSize2, chunkSize3)
| otherwise = (chunkSize1, chunkSize2)
-- | size' > size34 chunkSize2 = (chunkSize1, chunkSize2)
-- | otherwise = (chunkSize0, chunkSize1)
| size' > size34 chunkSize2 = (chunkSize1, chunkSize2)
| otherwise = (chunkSize0, chunkSize1)
size34 sz = (fromIntegral sz * 3) `div` 4
prepareSizes 0 = []
prepareSizes size