Fixing Yeoman's file system #658
How can they be confusing?
YES! There's an old issue somewhere where I propose something similar. I think we should adopt the vinyl format for representing in-memory files.
Yes, they shouldn't have to care about where the source and destination are. But it should be easy to get that information when needed.
Maybe some kind of filter for what files to process. That way we could have only one method for copying.
For copying without processing we should use streams as they're much faster. For processing we could still use streams, but buffer them up in memory. If anyone has any gripes about this system, now is the time to bring it up. This especially pertains to generator authors.
+1. I'd just use vinyl-fs and write some methods around that to cover our needs.
@yeoman/yeoman-core
👍 for vinyl, it is a great concept and the module ecosystem is pretty good.
Also 👍 on using vinyl. Not sure how the "conflicter" works in yeoman (haven't used it before) but this might be of interest for only replacing non-existing or newer files https://github.com/Marak/vinyl-diff
I'm not sure I see any advantage for us to use vinyl streams. I feel like most actions a generator takes are single-file based; we don't do batch processing like a build system would. Any example of how this would apply in the context of a generator system? (I'm all in for the vinyl format, my concern is about vinyl streams)
Ok, so I've been pondering on this idea. And here's how I see the API:

In Memory FS

This will be one module. An instance will be globally shared on the environment object so multiple generators can access it. It will basically give access to 3 functionalities:

```js
// Basic operations
memFs.read('file.js'); // returns a vinyl file
memFs.write(vinylFile);

// When ready to write to disk
memFs.commit();
```

The read action will either return a vinyl file it loaded previously, or create a new one - either from disk or an empty vinyl file.

The commit action will take every file in memory and apply its state to the disk. Each vinyl file can have one of three states: "idle", "write", "delete". Idle files are ignored, write files are written and delete files are removed from disk. Once a file action is committed to disk, the file is removed from the memory FS. The way commit does this is IMO an implementation detail and could be anything (I'm alright with using streams here).

One thing I'm unsure about is how we're going to handle conflicts at the commit phase. I was thinking of maybe allowing a way to fork a current memory FS so we can determine which one can be written and which one needs user validation. I'm still thinking about this one. (And we might just end up removing the commit phase from this module.)

This module will be used by the system internally. Generator authors won't need to understand or touch it.

Edition commands

These will be utilities to transform the in-memory FS. They're going to be generator specific (so that the versioning can change more easily). These are going to be called by the generator author (so they are the public API). I guess we know what usual interface we want on this end (…)

Feedback

That's your turn :)
After reconsideration, the store won't know how to commit itself to disk. It'll only provide iteration helpers (maybe a …).

The neat thing about passing it through a stream is that we could register and reuse any existing gulp plugins (like a file beautifier, style standardization, pre-compilation) to process the project to the end user's taste.

The only "bad" point is that streams are not super flexible and concurrent compared to Rx.js - but I think the leverage we can get from reusing gulp plugins is worth it.
FWIW, started prototyping on this idea:
An issue I see we'll have using gulp plugins is that most of them expect a specific file type to be passed in. They usually don't check the type and decide whether or not to pass the file through. In the case of yeoman, we'd just stream all the files and we'd expect plugins to either pass through or apply their effect if they support the file type. Do you think we could just fix gulp plugins to work this way or is there really an incompatible mindset? (cc @contra)
@SBoudrias Incompatible mindset, it's up to the user to decide which files to pass where in gulp. If you pipe a folder with 10 types of files to a pipeline with 10 plugins in it, you need to use gulp-if to only pass *.coffee to coffee and so on.
Now part of 0.18.0!
Uber kool! @SBoudrias++
Bringing the composability feature in yeoman-generator 0.17 really uncovered a huge design mistake in our file system helpers: writes are done in a synchronous way, but really they are asynchronous.
This issue means multiple generators cannot easily edit the same file's content. It also means multiple writes to the same file will trigger multiple conflict checks on content that was added by multiple generators.
Current state
Legacy file system
The legacy file system is a pretty rough implementation of file system helpers (`this.write()`, `this.copy()`).
The good
They do the job. Behind the scenes they're aware of source and destination directories.
The bad
File-utils
The newest system is file-utils - which is basically a fork of Grunt-file.
The good
The bad
Take away
The things I take away from our current state are:
The `Conflicter` is a core part and needs to stay (users need to know when we're going to overwrite their files)
Where to go from there?
So I've been thinking about this issue for a while, and I think the way forward is to abstract our file-system handling into an in-memory representation of the file system that generators can read from and write to. Then, at the end of the generation process, we commit the memory file system to disk, passing every file through a series of write filters and then to our conflicter.
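The commit flow just described could be sketched like this, with hypothetical stand-ins for the write filters, the conflicter, and the disk writer (none of these names are the actual yeoman API):

```javascript
'use strict';

// Commit every in-memory file: run it through each write filter,
// then let the conflicter decide whether the write may proceed.
function commit(memoryFiles, writeFilters, conflicter, writeToDisk) {
  for (let file of memoryFiles) {
    // 1. Apply every write filter (beautifier, style normalizer, ...).
    for (const filter of writeFilters) {
      file = filter(file);
    }
    // 2. The conflicter returns true when the user allows the write.
    if (conflicter(file)) {
      writeToDisk(file);
    }
  }
}
```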
Some questions still remain:
Should we hide the source/destination logic from the end user?
Are we better off with an API that hides the logic away like the legacy system, or should we make it very obvious:
(That's an example, we would refine the API)
What do we do with large FS needs?
Right now we have `this.bulkCopy()`; does it make sense to keep this as part of the core system? If not, what is the way forward when someone wants to copy over the 10K files from WordPress?
Would it make sense/help performance to use streams/Rx.js?
I'm not sure about that, because most of the time is really spent on the conflicter waiting for user input.
I'm raising this question because it was a major point of the yeoman.next document.
Etc, etc
The discussion is open.