Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate cases in which cache is stale #1184

Closed
ghost opened this issue Jan 13, 2017 · 33 comments
Closed

Investigate cases in which cache is stale #1184

ghost opened this issue Jan 13, 2017 · 33 comments
Labels
bug Unexpected and reproducible misbehavior.

Comments

@ghost
Copy link

ghost commented Jan 13, 2017

We have just updated to 0.16 in our codebase and I'm working through the new rules. First up it's sorted_imports. For this, we have a build phase which runs ${PODS_ROOT}/SwiftLint/swiftlint as per the guidelines in the README file. I've worked through part of our codebase (it takes a while as we have well over a hundred thousand lines of Swift code) and sorted the imports. I then re-ran the linter, to clear out the issues I'd already fixed and make it easier to keep track of my progress.

At this point, the linter reported the remaining issues, but also reported several files I'd already fixed as having the issue. Manual inspection confirmed that the imports had already been sorted correctly.

I ignored the issue for the time being and continued through the real issues, eventually reaching the end. Rerunning left only the false positives. No matter how many times I re-ran the linter, it would keep these false positives.

However, changing our script in the build phase to ${PODS_ROOT}/SwiftLint/swiftlint line --no-cache resulted in the linter finding lots of files I hadn't got to, and removed all the false positives.

I have no idea what's going on behind the scenes, but the new cache added in 0.16 seems to be broken.

@jpsim
Copy link
Collaborator

jpsim commented Jan 13, 2017

It's quite possible that the approach used to determine if a file has changed is insufficient. We're currently hashing the file, so maybe the hash isn't affected by a re-ordering of lines?

@jpsim jpsim added the bug Unexpected and reproducible misbehavior. label Jan 13, 2017
@jpsim jpsim changed the title Cache seems to be broken Reshuffling the same characters in a file doesn't update the cache Jan 13, 2017
@marcelofabri
Copy link
Collaborator

It seems that the hash changes when re-ordering lines:

  1> let str1 = "import Foo\nimport Bar" 
str1: String = "import Foo\nimport Bar"
  2> let str2 = "import Bar\nimport Foo" 
str2: String = "import Bar\nimport Foo"
  3> str1.hashValue
$R0: Int = -300990755179970577
  4> str2.hashValue
$R1: Int = 7298368255116804847

@ghost
Copy link
Author

ghost commented Jan 19, 2017

Any further progress on this one? We are waiting on 0.17 for a few fixes, and while we could go ahead with it currently, we would need to always disable the cache due to these problems. Obviously that's undesirable for everyone.

@marcelofabri
Copy link
Collaborator

Not that I'm aware of, but feel free to investigate it deeper! A repro case would be very welcome 😉

@jpsim
Copy link
Collaborator

jpsim commented Jan 21, 2017

It seems that the hash changes when re-ordering lines

Sorry, it was just a guess. I'll update the issue title to not be so misleading 😉

@jpsim jpsim changed the title Reshuffling the same characters in a file doesn't update the cache Investigate cases in which cache is stale Jan 21, 2017
@norio-nomura
Copy link
Collaborator

norio-nomura commented Jan 24, 2017

It is a mistake to use hash to detect changes in String, because hash does not use the entire contents of String.

An example of hash collision:

import Foundation

let first32Char = repeatElement("f", count: 32)
let middle32Char = repeatElement("m", count: 32)
let last32Char = repeatElement("l", count: 32)
let a = (first32Char + ["ab"] + middle32Char + ["ab"] + last32Char).joined()
let b = (first32Char + ["bc"] + middle32Char + ["bc"] + last32Char).joined()
a == b // false
// Foundation
a.hash // -4528083392938427260
b.hash // -4528083392938427260
// Swift.Hashable
a.hashValue // -8957072229412966235
b.hashValue // -8957072229412966235

@jpsim
Copy link
Collaborator

jpsim commented Jan 24, 2017

Maybe we should consider the last modified time of the file instead

@marcelofabri
Copy link
Collaborator

What about using MD5 as hash?

@jpsim
Copy link
Collaborator

jpsim commented Jan 24, 2017

I think MD5 or SipHash would lead to a more reliable result, but I still think last modified time might be most reliable. It's what tools like rsync use to determine whether or not a file has changed.

@phoney
Copy link

phoney commented Jan 25, 2017

I ran into this with the 0.16.0 version on the type_name rule. I had a type configurableObject that was lowerCamelCase. I fixed it in a number of files to be upperCamedlCase but those errors continued to be reported.

I believe that most tools use the modification time for this. Make sure to do a != comparison rather than a > comparison between the saved mod time and the one on the file. There are oddball cases where the clock can be reset or a file can come from another computer and the dates are in different time zones where the != comparison works but the > comparison doesn't.

Is there a way to delete the cache to work around this issue?

@marcelofabri
Copy link
Collaborator

@phoney You can run swiftlint lint --no-cache

@benasher44
Copy link
Contributor

benasher44 commented Jan 26, 2017

While working on #1191, I found that the current caching behavior might not be reliable across runs of swiftlint when there are multiple paths present. Currently, the path to the cache is determined based on one of the following, if the cache_path config option is not provided:

  1. The full path to the folder swiftlint is operating on (or the current working directory)
  2. The full path to the path passed to --path

I'm not sure what happens (in terms of automatic caching) when swiftlint is used from Xcode (i.e. multiple paths are passed to lint via environment variables). Maybe someone can help me out here?

In any case, the fallback logic used to determine a default cache path (based on the above numbered options I mentioned) falls apart a bit when allowing people to pass multiple paths to swiftlint as arguments. You could hash all of the paths passed to swiftlint combined, but if you omit a single path, then you'll bust the cache using that strategy (assuming you're meaning to lint the same project minus one path). Given this, my current thought is to remove automatic caching, and I have a few thoughts about why:

  1. I expect a CLI command like swiftlint lint to behave the same across multiple invocations. This isn't the case right now, since it has built-in (i.e. enabled by default / opt-out) automatic (determine cache key automatically) caching (after the first invocation, it leans on the cache).
  2. As mentioned earlier, the automatic caching logic seems to fall apart when passing multiple paths. At least right now, I can't think of any great ways to make this better that easily re-uses the same cache for the same project across multiple runs when passing multiple paths (assuming you might not pass the same paths each time but want to re-use the cache for your project).
  3. Caching is easy to enable in the configuration, and this feature is probably most useful for longer term projects where you'll likely be leaning heavily on a config file anyway.
  4. Because of 3, most CLI usages of swiftlint that don't use a config file probably don't benefit from caching, so it seems a bit wasteful to leave on by default.

I'd appreciate any comments filling in knowledge gaps here. Thanks!

@benasher44
Copy link
Contributor

Made a PR for this (#1267) to show what this entails and to push this discussion forward a bit :)

@victorpimentel
Copy link
Contributor

I have tackled this by trying to fixing the cache, not by disabling it 😄 The change is a bit more intrusive that I had hoped, but in my tests it works well and the performance is really good.

Can someone test it in their own projects (backuping up first 🤓 )? Thanks!

@benasher44
Copy link
Contributor

benasher44 commented Feb 2, 2017

To clarify, I was told I could use this issue as an umbrella for general caching issues (i.e. disabling automatic caching won't solve the original issue in the description/title). I don't mean to suggest that caching should be disabled all together. I just think you should have to opt into it.

@marcelofabri
Copy link
Collaborator

marcelofabri commented Feb 2, 2017

I have mixed feelings about turning it opt-in. Maybe we can do that until we're sure that we don't have any more issues with it.

Just as a comparison, rubocop enables caching by default.

Another thing to have in mind (but I don't know if it's a real issue) is that when building from master, you get the same version number as the latest stable version, so cache isn't invalidated.

@benasher44
Copy link
Contributor

Hm that's a good point. Maybe the automatic caching should only be project-oriented? What I mean is that maybe it only makes sense if you're linting a whole folder or have provided a cache path? This basically goes back to disabling caching when specifying multiple specific files via CLI arguments, but it would be enabled if you pass a single folder (or nothing to lint the current directory).

@marcelofabri
Copy link
Collaborator

Yeah, I think that's a good idea.

@victorpimentel
Copy link
Contributor

I'd leave this as a small learning warning for future references. Never use Foundation hashes for equality:

let cats: NSArray = ["Garfield", "Grumpy Cat", "Doraemon"]
let dogs: NSArray = ["Scooby-Doo", "Snoopy", "Pluto"]

cats.hash // 3
dogs.hash // 3

http://swiftlang.ng.bluemix.net/#/repl/58985cbae69a503cdf5221b9

@Greensource
Copy link

The cache file seem to be located into ~/Library/Application Support/SwiftLint.
Isn't better if it will be located into derived data for example?
Because today, when you clean or remove derivedData you think you start from scratch but it's not true for Swiftlint.

@jpsim
Copy link
Collaborator

jpsim commented Feb 15, 2017

In retrospect, Application Support is definitely the wrong place to be storing the cache.

In my work-in-progress branch to fix fundamental issues with our caching strategy (jp-modified-time), I've moved the cache location to the caches directory (ee9ce4a). In whatever shape the caching fixes end up in, I hope that's a change we can sneak in.

@victorpimentel
Copy link
Contributor

Ok, I did another pull request with the proposed fix, I hope we can merge that back soon 😄

@Mercieral
Copy link

On a somewhat related note while we wait for the PR, how do I specify the linter to ignore cache with "--no-cache" in the xCode build phase? It gives me a Shell Script Invovation Error line 2: /Users/Aaron/project/Pods/SwiftLint/swiftlint --no-cache: No such file or directory when trying to use a build phase script of "${SRCROOT}/Pods/SwiftLint/swiftlint --no-cache"

@marcelofabri
Copy link
Collaborator

@Mercieral You need to use the lint command explicitly to use any options: swiftlint lint --no-cache

@Mercieral
Copy link

@marcelofabri Sorry I should have mentioned that I tried both. So using "${SRCROOT}/Pods/SwiftLint/swiftlint lint --no-cache" gives me /Users/Aaron/project/Pods/SwiftLint/swiftlint lint --no-cache: No such file or directory

@victorpimentel
Copy link
Contributor

@Mercieral It seems that you don't have the swiftlint pod installed. You need to follow these instructions to integrate it with your projects:

https://github.com/realm/SwiftLint#using-cocoapods

Maybe you just need to do a pod install...

@Mercieral
Copy link

@victorpimentel the swiftlint pod is installed using the newest version of cocoapods (just checked yesterday). The problem is that I can run "${SRCROOT}/Pods/SwiftLint/swiftlint" in the xcode build phase script just fine and it works, and I can run "Pods/SwiftLint/swiftlint lint --no-cache" just fine from the command line, but running "${SRCROOT}/Pods/SwiftLint/swiftlint lint --no-cache" in the build script causes xcode not to be happy. Am I the only one this happens to?

@marcelofabri
Copy link
Collaborator

@Mercieral can you post your whole script here?

@Mercieral
Copy link

I found the issue, simple fact that I thought that build scripts needed to be in quotes...
I can remove my comments in this issue so it doesn't pollute the discussion of the actual issue if you want.

@ZevEisenberg
Copy link
Contributor

Off-topic, but I would still use quotes like this to guard against paths with spaces in them:

"${SRCROOT}/Pods/SwiftLint/swiftlint" lint --no-cache

@jpsim jpsim mentioned this issue May 15, 2017
@marcelofabri
Copy link
Collaborator

Closed by #1530

@acecilia
Copy link
Contributor

acecilia commented Dec 28, 2021

Hi 👋 I have some questions about cache that cant find answers for. Found this discussion, maybe you guys can help me out 🙏 :

  • Why is there a --no-cache parameter? I was assuming if somebody does not want to use cache, then not passing the --cache-path parameter is enough.
    EDIT: because the cache is being used by default, the --cache-path only changes the path where it gets generated
  • Is there any way to remove the swiftlint cache that does not involve manually deleting the corresponding folder? Something like swiftlint cache delete?

@jpsim
Copy link
Collaborator

jpsim commented Jan 21, 2022

  • Is there any way to remove the swiftlint cache that does not involve manually deleting the corresponding folder? Something like swiftlint cache delete?

No there's no way to clear the cache via SwiftLint. As far as I know, SwiftLint's caching mechanism has been robust for several years now, so there's no compelling reason to expose a way to clear the cache for end users.

However, disabling the cache is extremely useful when working on SwiftLint itself, since SwiftLint's version number is used to assume that if the version number doesn't change, SwiftLint's behavior hasn't changed, which is often not the case when making changes to SwiftLint itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Unexpected and reproducible misbehavior.
Projects
None yet
Development

No branches or pull requests

10 participants