Fix a race condition with initial secret_key.py creation #6775

Merged
nmanovic merged 1 commit into cvat-ai:develop from atomic-secret-key on Sep 5, 2023

Conversation

SpecLad
Contributor

@SpecLad SpecLad commented Aug 31, 2023

Motivation and context

Currently, it's possible (even if unlikely) for multiple backend processes to create and use different Django SECRET_KEY values, because the following scenario can happen:

  • process 1: fail to import secret_key.py
  • process 2: fail to import secret_key.py
  • process 1: generate a new secret_key.py
  • process 1: import secret_key.py
  • process 2: generate a new secret_key.py
  • process 2: import secret_key.py
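The interleaving above can be replayed deterministically in a single process. This is only an illustrative sketch: `naive_generate`, `read_key`, and `replay_race` are hypothetical names, and `read_key` is a stand-in for importing the module, not the project's actual code.

```python
import secrets
import tempfile
from pathlib import Path


def naive_generate(path: Path) -> None:
    """Non-atomic generation: unconditionally (over)writes the key file."""
    path.write_text(f"SECRET_KEY = {secrets.token_urlsafe(32)!r}\n")


def read_key(path: Path) -> str:
    """Stand-in for `import secret_key`: executes the file and returns its key."""
    namespace: dict = {}
    exec(path.read_text(), namespace)
    return namespace["SECRET_KEY"]


def replay_race(directory: Path) -> tuple[str, str]:
    path = directory / "secret_key.py"
    # Steps 1-2: both processes fail to import because the file is missing.
    p1_missing = not path.exists()
    p2_missing = not path.exists()
    assert p1_missing and p2_missing
    # Steps 3-4: process 1 generates the file and imports it.
    naive_generate(path)
    key1 = read_key(path)
    # Steps 5-6: process 2 acts on its stale check and overwrites the file.
    naive_generate(path)
    key2 = read_key(path)
    return key1, key2


with tempfile.TemporaryDirectory() as tmp:
    key1, key2 = replay_race(Path(tmp))
    print("keys differ:", key1 != key2)
```

Process 1 keeps using a key that no longer matches the file on disk, which is the inconsistency described here.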

Fix this by making it so that secret_key.py is created atomically, and never overwritten if it already exists.
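One way to get "created atomically, and never overwritten" is to write the full content to a temporary file and then hard-link it to the final name. The sketch below assumes that technique (the hard-link discussion later in this thread suggests it, but the merged code is not shown here); `create_secret_key_file` is a hypothetical name.

```python
import os
import secrets
import tempfile
from pathlib import Path


def create_secret_key_file(target: Path) -> bool:
    """Create `target` atomically; return False if it already existed."""
    # Write the complete content to a uniquely named file in the same directory.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"SECRET_KEY = {secrets.token_urlsafe(50)!r}\n")
        try:
            # link() fails if the target name exists, so the first writer wins
            # and readers never observe a partially written file.
            os.link(tmp_name, target)
        except FileExistsError:
            return False
        return True
    finally:
        # The temporary name is no longer needed in either case.
        os.unlink(tmp_name)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "secret_key.py"
    print(create_secret_key_file(target))  # first caller creates the file
    print(create_secret_key_file(target))  # second caller leaves it untouched
```

Every process then imports whatever content ended up under the final name, regardless of which process created it. This approach requires a filesystem that supports hard links.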

In addition, only generate the secret key if the import fails due to the module not being found, since other failure reasons suggest incorrect configuration or data corruption, and so require administrator attention.
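A minimal sketch of that import policy, assuming a top-level module name; `import_secret_key` and its `generate` callback are hypothetical names, not the project's actual settings code.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_secret_key(module_name: str, generate) -> str:
    """Import the key module, generating it only if the module is missing."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Re-raise if some *other* module (e.g. one the key file imports) is missing.
        if exc.name != module_name:
            raise
        generate()
        # Make the import system notice the newly created file.
        importlib.invalidate_caches()
        module = importlib.import_module(module_name)
    # Any other failure (SyntaxError, PermissionError, ...) propagates to the caller.
    return module.SECRET_KEY


with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        key = import_secret_key(
            "demo_secret_key",
            lambda: (Path(tmp) / "demo_secret_key.py").write_text(
                "SECRET_KEY = 'example'\n"
            ),
        )
        print(key)
    finally:
        sys.path.remove(tmp)
```

A corrupted or unreadable key file therefore surfaces as an error instead of being silently replaced.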

How has this been tested?

Manual testing.

Checklist

  • I submit my changes into the develop branch
  • I have added a description of my changes into the CHANGELOG file
  • [ ] I have updated the documentation accordingly
  • [ ] I have added tests to cover my changes
  • [ ] I have linked related issues (see GitHub docs)
  • [ ] I have increased versions of npm packages if it is necessary
    (cvat-canvas,
    cvat-core,
    cvat-data and
    cvat-ui)

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.

@SpecLad SpecLad marked this pull request as ready for review August 31, 2023 15:32
@SpecLad SpecLad requested a review from azhavoro August 31, 2023 15:32
@codecov

codecov bot commented Aug 31, 2023

Codecov Report

Merging #6775 (dc999b3) into develop (2dd4f25) will increase coverage by 0.01%.
Report is 3 commits behind head on develop.
The diff coverage is 98.24%.

@@             Coverage Diff             @@
##           develop    #6775      +/-   ##
===========================================
+ Coverage    82.43%   82.45%   +0.01%     
===========================================
  Files          366      366              
  Lines        39833    39761      -72     
  Branches      3545     3545              
===========================================
- Hits         32835    32783      -52     
+ Misses        6998     6978      -20     
| Components | Coverage Δ | |
|---|---|---|
| cvat-ui | `77.40% <ø> (-0.01%)` | ⬇️ |
| cvat-server | `86.87% <98.24%> (+0.05%)` | ⬆️ |

@nmanovic
Contributor

nmanovic commented Sep 5, 2023

@SpecLad, let's fix the conflict and simplify the code as discussed. Ping me in Slack when it is ready.

@SpecLad SpecLad force-pushed the atomic-secret-key branch 2 times, most recently from f5a1d32 to dc999b3 Compare September 5, 2023 16:15
Contributor

@nmanovic nmanovic left a comment

LGTM

@nmanovic nmanovic merged commit ab7c663 into cvat-ai:develop Sep 5, 2023
33 checks passed
@Keramblock
Contributor

Hi @SpecLad @nmanovic, I am not sure, but it looks like this broke initialization of new instances deployed with the Helm chart:
[screenshot]

@SpecLad
Contributor Author

SpecLad commented Sep 6, 2023

@Keramblock Thanks for the report. What filesystem are you using?

@SpecLad SpecLad deleted the atomic-secret-key branch September 6, 2023 13:36
@Keramblock
Contributor

Keramblock commented Sep 6, 2023

> @Keramblock Thanks for the report. What filesystem are you using?

@SpecLad that is the default Azure RWX volume:
[image]
[image]

Here is the PVC spec:
[image]
Storage class specs:
[image]

@SpecLad
Contributor Author

SpecLad commented Sep 6, 2023

Hrm. Looks like Azure Files doesn't support hard links when mounted over SMB. Could you try mounting over NFS instead?
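To check whether a given mount supports hard links, a quick probe along these lines may help. This is a rough sketch: `supports_hard_links` is a hypothetical helper, and since the exact error raised on an SMB-mounted Azure Files volume is not shown in this thread, it catches `OSError` broadly.

```python
import os
import tempfile


def supports_hard_links(directory: str) -> bool:
    """Return True if a hard link can be created inside `directory`."""
    fd, src = tempfile.mkstemp(dir=directory)
    os.close(fd)
    dst = src + ".link"
    try:
        os.link(src, dst)
    except OSError:
        # Assumption: unsupported filesystems report some OSError subclass.
        return False
    else:
        os.unlink(dst)
        return True
    finally:
        # Always remove the probe file itself.
        os.unlink(src)


print(supports_hard_links(tempfile.gettempdir()))
```

Run it against the directory where `secret_key.py` is stored; the probe cleans up after itself.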

@Keramblock
Contributor

Keramblock commented Sep 6, 2023

We don't have an NFS server there (and I am not sure whether Azure provides a production-ready NFS storage class for k8s at all), but is it really necessary to use a hard link here?

P.S. For now I just copied the files from another server, and it is working fine.
P.P.S. We can move to NFS if necessary, but I think that will severely limit your customers if they try to deploy via Helm in managed k8s.

@azhavoro azhavoro mentioned this pull request Sep 6, 2023
nmanovic added a commit that referenced this pull request Sep 6, 2023
### Added

- Gamma correction filter (<#6771>)
- Introduced the feature to hide or show objects in review mode (<#6808>)

### Changed

- \[Helm\] Database migrations are now executed as a separate job,
  rather than in the server pod, to mitigate the risk of data
  corruption when using multiple server replicas
  (<#6780>)
- Clicking multiple times on icons in the left
  sidebar now toggles the corresponding popovers open and closed
  (<#6817>)
- Transitioned to using KeyDB with FLASH for data
  chunk caching, replacing diskcache (<#6773>)

### Removed

- Removed outdated use of hostnames when accessing Git, OpenCV, or analytics via the UI (<#6799>)
- Removed the Feedback/Share component (<#6805>)

### Fixed

- Resolved the issue of the canvas zooming while scrolling
  through the comments list in an issue (<#6758>)
- Addressed the bug that allowed for multiple issue
  creations upon initial submission (<#6758>)
- Fixed the issue of running deep learning models on
  non-JPEG compressed TIFF images (<#6789>)
- Adjusted padding on the tasks, projects, and models pages (<#6778>)
- Corrected hotkey handlers to avoid overriding default behavior when modal windows are open
  (<#6800>)
- Resolved the need to move the mouse to activate
  brush or eraser effects; a single click is now sufficient (<#6800>)
- Fixed a memory leak issue in the logging system (<#6804>)
- Addressed a race condition that occurred during the initial creation of `secret_key.py`
  (<#6775>)
- Eliminated duplicate log entries generated by the CVAT server
  (<#6766>)
mikhail-treskin pushed a commit to retailnext/cvat that referenced this pull request Oct 25, 2023