Skip to content
This repository has been archived by the owner on Dec 20, 2024. It is now read-only.

feature: Implement stream mode as a new download and upload method #1447

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

qhb1001
Copy link
Contributor

@qhb1001 qhb1001 commented Jul 31, 2020

Ⅰ. Describe what this PR did

The main work is to implement the P2P stream mode in supernode side. To summarize, the streaming download process in supernode could be cateloged into two sections.

  • Maintain the sliding window in supernode.

    • Initialize: The size of window will be registered in the dfget task registration phrase. And then, the window state is recorded in the ProgressManager as the SyncMap indexed by ClientID. The size of window is staic once it is recorded by supernode.
    • Update: The dfget client will report to the supernode about the result of downloading for every piece. In the report API of supernode server, the window will be updated based on the received piece number and piece status. To be specific, the window keeps sliding until the unacknowledged piece. It should be noted that the supernode is using the sender window, which has the range of [una, una + wnd) .
    • Finish: Update the DeleteCID method of ProgressManager to delete the sliding window state if it is in stream mode.
  • Schedule the pieces according to the window state

    The only modification that I made in this part lies at the GetPieceProgressByCID method of ProgressManager. The available pieces in regular mode means the success pieces which are not running; while for the stream mode, the available pieces means the cached pieces which are not running. After the modification, the ProgressManager would only return the available pieces inside the window when stream mode is on.

    A new kind of piece status is created: UNCACHED. When the piece is downloaded successfully, I assume that it will sotred into the cache immediately. Afterwards, the piece maybe popped out of the cache. And in this case, the handler of supernode server deletePieceCache at supernode/server/router.go:101 would be called to change the state of the piece.

    It should be noted that, the scheduler in stream mode currently uses the same scheduler as the regular mode, which may demands future optimization.

Besides the above modifications, I also add two APIs in supernode server, which are addPieceCache and deletePieceCache. The dfget client calls these APIs to notify the supernode about the state of cache.

Ⅱ. Does this pull request fix one issue?

This PR is related to #1436 #1164.

Ⅲ. Why don't you add test cases (unit test/integration test)? (你真的觉得不需要加测试吗?)

The unit tests have been added.

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

@qhb1001 qhb1001 force-pushed the feature#streamMode branch 2 times, most recently from 4e8e239 to f895330 Compare July 31, 2020 01:19
@qhb1001 qhb1001 force-pushed the feature#streamMode branch from f895330 to 83b235a Compare August 16, 2020 19:21
@qhb1001 qhb1001 force-pushed the feature#streamMode branch from 717cbc1 to b250fb6 Compare August 16, 2020 19:52
@codecov-commenter
Copy link

Codecov Report

Merging #1447 into master will decrease coverage by 0.34%.
The diff coverage is 43.30%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1447      +/-   ##
==========================================
- Coverage   52.29%   51.95%   -0.35%     
==========================================
  Files         136      138       +2     
  Lines        9134     9506     +372     
==========================================
+ Hits         4777     4939     +162     
- Misses       3972     4170     +198     
- Partials      385      397      +12     
Impacted Files Coverage Δ
dfdaemon/seed/downloader.go 70.21% <0.00%> (ø)
dfget/config/config.go 89.28% <ø> (ø)
dfget/core/api/download_api.go 29.54% <0.00%> (-0.69%) ⬇️
dfget/core/api/uploader_api.go 0.00% <0.00%> (ø)
dfget/core/core.go 39.44% <0.00%> (-4.89%) ⬇️
dfget/core/uploader/uploader.go 88.03% <ø> (ø)
dfget/types/register_request.go 0.00% <ø> (ø)
supernode/daemon/mgr/progress/progress_delete.go 0.00% <0.00%> (ø)
supernode/daemon/mgr/task/manager.go 27.73% <0.00%> (-0.84%) ⬇️
supernode/server/0.3_bridge.go 0.00% <0.00%> (ø)
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4906cd1...b250fb6. Read the comment docs.

@qhb1001 qhb1001 changed the title feature: support stream mode in supernode feature: Implement stream mode as a new download and upload method Aug 17, 2020
qhb1001 and others added 3 commits August 26, 2020 17:42
Signed-off-by: Hongbo Qin <hongbo.qin.1001@gmail.com>
Signed-off-by: Hongbo Qin <hongbo.qin.1001@gmail.com>
Signed-off-by: Hongbo <qin.ho@northeastern.edu>
@qhb1001 qhb1001 force-pushed the feature#streamMode branch from b250fb6 to 8ac634a Compare August 27, 2020 00:43
Signed-off-by: Hongbo <qin.ho@northeastern.edu>
@qhb1001 qhb1001 force-pushed the feature#streamMode branch from 8ac634a to 8acfce9 Compare August 27, 2020 00:43
@qhb1001
Copy link
Contributor Author

qhb1001 commented Aug 27, 2020

The stream mode is fullfilled with three more perpectives, which are DFGET downloader, DFGET uploader, and the IPC between user and DFGET.

DFGET Downloader

Since the client stream writer has been implemented under the p2p_downloader folder, which means that the pieces downloading process has been finished. My work here is to fullfill the task of handling the successfully downloaded pieces to uploader. Besides that, in the perparation phase at dfget/core/core.go:171, the registration of stream task is necessary for uploader. Here are all the modifications:

  • dfget/core/core.go:210: The doDownload method is refactored. One new interface is defined as DownloadTimeoutTask. The interface has two implementations, which are regular downloader and the stream downloader.

  • dfget/core/downloader/downloader.go:79: The Start() method of the StreamDownloadTimeoutTask struct will call the RunStream method of the downloader, and then pass the stream reader to the startWriter method.

  • dfget/core/downloader/downloader_util.go:55: The startWriter method is used to fetch the pieces from the stream reader concurrently, and then upload the pieces to uploader in order. The logic of the startWriter is referenced from the stream downloading method in CDN manager at supernode/daemon/mgr/cdn/super_writer.go:59.

    The unit test has been added.

DEGET Uploader

The previous uploader is solely used for the regular downloading method. For the stream uploader, I have added new APIs for it. Apart from that, the cache manager is added to manage the cache pieces.

  • dfget/core/api/supernode_api.go:359: A new kind of updating piece status API has been implemented. It is used to change the peice status from SUCCESSFUL to UNCACHED.

  • dfget/core/api/uploader_api.go:108: The uploader API RegisterStreamTask is added to register the stream task at cache manager. The initialization of the according entry at cache manager is finshed after the call.

  • dfget/core/api/uploader_api.go:127: The uploader API DeliverPieceToUploader is used to handle the piece from downloader to uploader.

  • dfget/core/uploader/cache.go:60: The FIFO cache manager is created to control the piece caches. It would store and pop the piece in the style of FIFO. It should be noted that the cache manager is using the sender window, which has the range of [start, una).

    The unit test has been added.

  • TODO: GC for cache manager.

IPC Between DFGET and User

This part demands further discussion. Currently, I propose that the DFGET can directly output the content of the pieces to stdout, and then the user who calls the DFGET command, can redirect the stdout of the DFGET command to the write side of the pipe by generally supported method popen.

After that, the user can directly read from the pipe, and get the successfully downloaded content. Since the method uses the unnamed pipe, the whole process is not related with the file system.

@qhb1001
Copy link
Contributor Author

qhb1001 commented Aug 27, 2020

@garfield009 @ansinlee @starnop PTAL

@@ -114,3 +114,65 @@ func (s *DFGetP2PTestSuite) TestDownloadFile(c *check.C) {
}
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The integration test here is annotated due to the reason that the detailed implementation of IPC between user and DFGET demands further discussion.

@jinuxstyle
Copy link

Not clear what problem the PR solves. Performance? Better rate limiting? or else ?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants