Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect time measurements in unpublish workflow prevents unpublishing #652

Closed
thomashoneyman opened this issue Aug 4, 2023 · 5 comments · Fixed by #656
Closed

Incorrect time measurements in unpublish workflow prevents unpublishing #652

thomashoneyman opened this issue Aug 4, 2023 · 5 comments · Fixed by #656
Labels
alpha bug Something isn't working help wanted Extra attention is needed

Comments

@thomashoneyman
Copy link
Member

As seen in:
purescript/registry#346 (comment)

...a package published just minutes prior cannot be unpublished because, and I quote:

Cannot unpublish web-fetch@4.0.0 because it was published 6063.176484722222 hours ago. Packages can only be unpublished for 48 hours after publication.

This error arises here:

now <- nowUTC
published <- case Operation.Validation.validateUnpublish now payload.version metadata of
Left NotPublished ->
Except.throw $ "Cannot unpublish " <> formatted <> " because this version has not been published."
Left AlreadyUnpublished ->
Except.throw $ "Cannot unpublish " <> formatted <> " because it has already been unpublished."
Left InternalError -> do
Log.error $ formatted <> " is listed as both published and unpublished, which should be impossible!"
Except.throw $ "Cannot unpublish " <> formatted <> " due to an internal error."
Left (PastTimeLimit { difference, limit }) ->
Except.throw $ Array.fold
[ "Cannot unpublish " <> formatted <> " because it was published " <> Number.Format.toString (unwrap difference) <> " hours ago. "
, "Packages can only be unpublished for " <> Number.Format.toString (unwrap limit) <> " hours after publication."
]

But the real implementation is deeper, in the registry library itself:

-- | Verifies that a package version is eligible for unpublishing.
-- | The version must have been published before, must not have been unpublished before, and unpublishing
-- | must happen within the 48 hours following publishing.
validateUnpublish :: DateTime -> Version -> Metadata -> Either UnpublishError PublishedMetadata
validateUnpublish now version (Metadata metadata) = do
let
inPublished = Map.lookup version metadata.published
inUnpublished = Map.lookup version metadata.unpublished
case inPublished, inUnpublished of
Nothing, Nothing ->
Left NotPublished
Just published, Nothing -> do
-- We only pass through the case where the user is unpublishing a
-- package that has been published and not yet unpublished.
let diff = DateTime.diff now published.publishedTime
if (diff > hourLimit) then
Left $ PastTimeLimit { limit: hourLimit, difference: diff }
else
Right published
Nothing, Just _ ->
Left AlreadyUnpublished
Just _, Just _ ->
Left InternalError
where
hourLimit = Hours 48.0

We're comparing the current time in the UTC time zone to the published time associated with the package, which is calculated here:

let
getRefTime = do
timestamp <- Except.rethrow =<< Run.liftAff (Git.gitCLI [ "log", "-1", "--date=iso8601-strict", "--format=%cd", ref ] (Just repoDir))
jsDate <- Run.liftEffect $ JSDate.parse timestamp
dateTime <- Except.note "Failed to convert JSDate to DateTime" $ JSDate.toDateTime jsDate
pure dateTime
-- Cloning will result in the `repo` name as the directory name
publishedTime <- Except.runExcept getRefTime >>= case _ of
Left error -> do
Log.error $ "Failed to get published time: " <> error
Except.throw $ "Cloned repository " <> owner <> "/" <> repo <> " at ref " <> ref <> ", but could not read the published time from the ref."
Right value -> pure value
pure { path: repoDir, published: publishedTime }

The problem in this specific case:

  1. A commit to the repository was made in 2022
  2. The tag associated with that commit wasn't made until today
  3. We published the package, looked up the ref time, and voila: we see 2022 and assign that as the 'published time'
  4. Now the publish time is long in the past, and you can't unpublish.

The reason why we use the ref time as the publish time is to support the legacy importer, in which we need to associate old publish times so the registry doesn't say everything was published in August 2023. But we probably should be using the current time in the publish pipeline instead.

@thomashoneyman thomashoneyman added bug Something isn't working help wanted Extra attention is needed alpha labels Aug 4, 2023
@f-f
Copy link
Member

f-f commented Aug 9, 2023

But we probably should be using the current time in the publish pipeline instead

I think there's value in preserving timestamps of old packages. We can take care of this issue by:

  • distinguishing when something is coming from the legacy importer vs the new pipeline (in which it's fair to take the current time)
  • allowing trustees to unpublish without any time limits - we'll need to enable this anyways for DMCA reasons.

@thomashoneyman
Copy link
Member Author

Good point on the second note there, and I agree with you we ought to preserve time stamps for old packages

pete-murphy added a commit to pete-murphy/registry-dev that referenced this issue Aug 18, 2023
@pete-murphy
Copy link
Contributor

pete-murphy commented Aug 18, 2023

  • distinguishing when something is coming from the legacy importer vs the new pipeline

Would this just amount to distinguishing between when this handle is called from the LegacyImporter module vs elsewhere? Effectively something like this diff here (15d747f) or would it be more involved than that?

pete-murphy added a commit to pete-murphy/registry-dev that referenced this issue Aug 19, 2023
pete-murphy added a commit to pete-murphy/registry-dev that referenced this issue Aug 19, 2023
pete-murphy added a commit to pete-murphy/registry-dev that referenced this issue Aug 19, 2023
@thomashoneyman
Copy link
Member Author

I think you're right that it's basically choosing between two scenarios: one in which this is an import of a recent package initiated by a user's action, and one in which this is an automatic import of a legacy package. In fact, we already have a type for the distinction between the two:

-- | Operations can be exercised for old, pre-registry packages, or for packages
-- | which are on the 0.15 compiler series. If a true legacy package is uploaded
-- | then we do not require compilation to succeed and we don't publish docs.
data Source = Legacy | Current

The easiest thing is probably to move this data type into Prelude.purs and then have the fetch function take it as an argument:

fetch :: forall r. FilePath -> Location -> String -> Run (SOURCE + EXCEPT String + r) FetchedSource
fetch destination location ref = Except.rethrow =<< Run.lift _source (Fetch destination location ref identity)

@thomashoneyman
Copy link
Member Author

(The comment on the Source data type isn't quite right, because with #255 we'll require all packages to compile; no exemption of "legacy" packages. But the idea is the same: in some ways we treat legacy packages differently than 'current' ones.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alpha bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants