Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update README.md #499

Closed
wants to merge 3 commits into from
Closed

Update README.md #499

wants to merge 3 commits into from

Conversation

david-ragazzi
Copy link
Contributor

Note that I eliminated env variables because this makes the understanding easier for newbies and avoid create several variables for folders related each another (usually in same source root).

Note that I eliminated env variables because this makes the understanding easier for newbies and avoid create several variables for folders related each another.

Generate build files:

mkdir (source)/build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we explicitly defined what (source) means here? Or maybe we can add a line above that specifies that all commands are being run from the nupic checkout directory, and assume that's where we are. Then we don't even have to use a reference to the source code dir and make all the commands relative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right Matt, (source) [or maybe (source dir)] is the nupic checkout directory itself (ex: /Volumes/DavidRagazzi/Desktop/nupic-master), not a literal value. I'm avoiding use env variables because this avoid a newbie have to read CMakeLists or shell scripts to find definitions. Your idea is good, we could add a line saying that (source dir) is the current location where user downloaded the repository.

PS: You could choose other name to (source) such as (source dir), (repo dir), (source folder), etc, and other symbols to embraces it ('[', '{', etc). (source) is the first name that come in my mind because it is extremely simple.. hehe Maybe "source dir" is better, "source" is a bit generic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Davidragazzi Are you suggesting they create a build directory inside the source repository? Currently we don't recommend that. Could it be (source)/../build?

@david-ragazzi
Copy link
Contributor Author

@subutai
Actually "(source dir)/build" is not a build directory inside the source repository (although all indicates, due to the choice of this name have been unhappy). It is the equivalent to "build_system" folder. I end up using this name due to build_system already exists. But to say true, 'build_system' is more appropriate, 'build' was a mistake (I was misled by CMake.. hehe)..

The 'build' folder I believe you mean would be (source dir)/bin (i.e. $HOME/nta/eng) which is the folder where generated binaries, libraries, and python tests are put on.

In summary, I included these folder changes in Readme file, but please forget:

  • "$HOME/nta/eng" becomes "(source dir)/bin" which follow practices of many systems of put binaries in a 'bin' subfolder.
  • "(source dir)/build_system" continues with the same function except that it will contain either autotools stuff or specified IDE (note that generated IDE by Cmake re-use the source files, i.e. it doesnt duplicate source and header files in "build_system" folder).

Well, from my experience, using a single folder to group these files categories is easier (today for example, I have to open 3 'Finders' for each folder: repository folder, $HOME/nta/eng and generated IDE folder).

But, this is only a suggestion. An another structure I have seen in some repositories would be:

  1. Nupic
  • 1.1. Sources (folder equivalent to current repository, except that readme, lincence, etc will be in a level up)
  • 1.2. Release (folder equivalent to current $HOME/nta/eng)
  • 1.2. Build_System (folder equivalent to $NUPIC/build_system with autotools or IDE stuff).
  • Readme, Licence and other docs.

I believe the latter one is more clean in addiction to allow better flexibility to user. Furthermore, in the future, when you think of releases, the structure already will be ready (i.e. users will can choose either source+binaries, or only binaries)!

But again, this is only a suggestion. I'm a bit addicted to organization and architecture, so forgive me if I'm being annoying suggesting these things .. hehe


#### Generate build files:

mkdir (source)/build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about keeping the existing $NUPIC instead of the (source)? My concern is: current users are used to it, and all existing documentation (mails, wiki,..) is refering to it. Also $XX makes sense from the shell point of view, while (source) does not. Just my 2c.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your concern.

But if we continue to use env variables, we have 2 problems:

  • Portability: We remain dependent of shell, which is not portable. To say truth, I really dont like Windows, but I recognise that most systems are based on it (unfortunately the curse of 90% of market share remains true), in addition to some developers love Visual Studio that works only on it.
  • Complexity not so useful: We will have to create a shell file setting variables only for 4 or 5 shell commands. CMake creates env variables but only for runtime purpose, i.e. python files know where are swig modules and the python path.

BUT...........
we could simply add more 1 line such as:
export NUPIC=path/to/source

This way we would have:
export NUPIC = /path/to/source
mkdir $NUPIC/build_system
cd ($NUPIC/build_system
cmake $NUPIC

However, if we want Windows in a soon future, we would have port these same instructions to:
set NUPIC /path/to/source
mkdir %NUPIC%/build_system
cd (%NUPIC%/build_system
cmake %NUPIC%

In my opinion, this would increase complexity. I really dont know if the price is fair.. :-(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, I just meant to use the term "$NUPIC" (which we are used to) instead of "(source)" in the readme.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah.. ok..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update:
$NUPIC and $NTA will remain, but they will be created by CMake, not shell scripts. After creation, users will can change $NTA and reference to where they want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear... if I don't have cmake installed, will make work instead? Then $NUPIC and $NTA will get created in a make file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I don't have cmake installed, will make work instead? Then $NUPIC and $NTA will get created in a make file?

Actually $NUPIC and $NTA are created from own CMake script. There's a CMake command called "execute_process()" which we can call any shell command without need create .sh files for this purpose (because this I insisted in remove shell scripts as they are not portable [even for UNIX-like OS]).

Note that CMake doesnt build anything, it simply generates (nicely) rules scripts for the build software that you choose (not matter if Make, Visual Studio, Eclipse, etc). So the repository wont have any Makefile or Configure.ac (for our enjoy!), but they will be generated from CMake script ONLY IF the user want generate them from command line (Make is the default generator when CMake is called from shell, but one also can choose Ninja as build tool which builds faster).

So CMake is bounden but Make is optional.

I'm not a Travis expert, but I believe that Travis script should call first CMake command for generating the Make (or Ninja) files and then build the project from these generated scripts

Update:

A simple Travis script that I adapted from internet (http://stackoverflow.com/questions/13051880/is-there-an-example-project-in-c-that-uses-opencv-and-travis-ci):

env:
....global:
........- NUPIC=$TRAVIS_BUILD_DIR/source
........- NTA=$TRAVIS_BUILD_DIR/release

...

install:
....- if [ $PY_VERSION != "2.6" ]; then (cd nupic-linux64/ && mkdir -p lib/python${PY_VERSION}/site-packages && make > /dev/null) fi
....# Workaround for multiprocessing.Queue SemLock error from run_opf_bechmarks_test.
....# See: travis-ci/travis-cookbooks#155
....- "sudo rm -rf /dev/shm && sudo ln -s /run/shm /dev/shm"
....- pip install -q -r $NUPIC/external/common/requirements.txt
....- mkdir $TRAVIS_BUILD_DIR/build_system
....- cd $TRAVIS_BUILD_DIR/build_system
....- cmake $NUPIC
....- make

Note that this line:
....- "$NUPIC/build.sh"

was replaced by this:
....- mkdir $TRAVIS_BUILD_DIR/build_system
....- cd $TRAVIS_BUILD_DIR/build_system
....- cmake $NUPIC
....- make

Assuming that $TRAVIS_BUILD_DIR is the repository root folder (right?), it's just replace these texts with those correspondent on current Travis script.

@subutai
Copy link
Member

subutai commented Dec 16, 2013

@david-ragazzi Not sure I fully understand. Today we have three different directories:

Source tree, ie. $NUPIC. This is the git repository. We should keep this very clean and not put temporary files in here. We already have more temporary stuff than I would like.

Build directory. This is for all the temporary build files generated by the compiler, such as .o files, .a files, etc. Currently it's often put in /tmp/ntabuild, but it could be anywhere as long as it's not inside the source tree.

Install or release directory. This the output of the build process and contains all the compiled binaries and python source files you need to run NuPIC. This can be placed anywhere outside of the source tree. Inside this directory there is a bin, lib, share, etc. Today this is defined by $NTA. A developer could have multiple releases available, e.g. for python 2.6 and python 2.7 Similarly they could have a debug or release version available.

Can we stick to this structure? Thanks!

@breznak
Copy link
Member

breznak commented Dec 16, 2013

@subutai I think David wanted to easy it for the user with no need to specify/add to .bashrc,.., and avoid exporting the variables (BUILDDIR is interfering with some program on my system), also the default bin/ is quite standart for Makefiles.

I also see your points of flexibility for developers to have multiple versions behind $NTA, etc... So would keeping the default values as David suggests, but adding ability to override with specifying $NTA/NUPIC/... be a good compromise? That way, if you keep your existing .bashrc settings, things would not change.

@subutai
Copy link
Member

subutai commented Dec 16, 2013

@breznak The user will still need to add the appropriate directory to PYTHONPATH in their .bashrc right? They might also need to set LD_LIBRARY_PATH

I guess I'm ok with it defaulting to the current directory as long as it can be overridden, per your suggestion, but usually it is very bad practice to start creating temporary files in the same directory as your repository. Presumably we will also need a "make install" that would create the install or release directory to some location?

BTW, BUILDDIR is just a temporary name used within build.sh which is a convenience script - it's not used anywhere else by the build system AFAIK. It's purely inside that script so you can name it something else.

There are two more convenience variables that I find extremely useful:

MK_JOBS - number of compile jobs to run in parallel. Speeds up builds quite a bit.

NTAX_DEVELOPER_BUILD - this creates a bunch of symlinks from the install dir back to the source directory. Avoids doing a build step for most python file changes.

These are also optional, but super useful.

@david-ragazzi
Copy link
Contributor Author

@subutai

But this is the structure is that I suggested later.. :-) The only difference is the location of them. Just for compare:

  • $NUPIC becomes $REPOSITORY/Sources
  • $NTA becomes $REPOSITORY/Release
  • /tmp/builddir could continue the same
    and Build_System (IDE solution file or make files generated by CMake) would be put outside from /Sources (which could leave still cleaner the /Sources folder).

At first configuration, CMake would create default values to $NUPIC and $NTA based on current source dir which CMake was called. After that, user could feel free to change the release location just changing the $NTA value (thanks @breznak for this observation!! ).

Bellow a screenshot of this model:
screen shot 2013-12-16 at 9 35 49 pm

In the above case, $NTA is set as "~/Desktop/nupic-master/Release". But if some user want change to another location, it is just re-config $NTA.

Please, forget my first suggestion, it is based on projects without concern with releases, but the this last structure is highly intuitive.

@subutai
Copy link
Member

subutai commented Dec 16, 2013

@david-ragazzi Sorry, my mistake in understanding the various proposals. The screen shot is helpful and seems very clean. As long as we can put Releases elsewhere easily I'm ok with this. Also, to stick with our convention maybe we can make the names in lower case and singular, like "release" and "source".

@david-ragazzi
Copy link
Contributor Author

@subutai It's ok!

Just an observation:
$NTA_ROOTDIR (not $NTA) becomes $NUPIC/release (unless NTA_ROOTDIR calls are changed to $NTA in all source files that use this variable). Today $NTA = $NTA_ROOTDIR (rendundant variables).

@subutai
Copy link
Member

subutai commented Dec 17, 2013

Makes sense. When you have all this working it would be good to have @scottpurdy and @oxtopus review it as well. They are pretty familiar with the build system, much more than I am. What happens to docs, tests, examples, and conf? Would they just stay where they are today?

@david-ragazzi
Copy link
Contributor Author

@subutai

The user will still need to add the appropriate directory to PYTHONPATH in their .bashrc right? They might also need to set LD_LIBRARY_PATH"

Not, CMake makes the magic! I wrote code to set PYTHONPATH and LD_LIBRARY_PATH in an automatic way. Newbies neither need understand shell commands to use NUPIC.. :-)

I guess I'm ok with it defaulting to the current directory as long as it can be overridden, per your suggestion, but usually it is very bad practice to start creating temporary files in the same directory as your repository. Presumably we will also need a "make install" that would create the install or release directory to some location?"

My post (with screen shot about the tree) explained this.

BTW, BUILDDIR is just a temporary name used within build.sh which is a convenience script - it's not used anywhere else by the build system AFAIK. It's purely inside that script so you can name it something else."

We can leave as is, I dont believe that a normal developer has interest in analyse these files.

There are two more convenience variables that I find extremely useful:

MK_JOBS - number of compile jobs to run in parallel. Speeds up builds quite a bit.

NTAX_DEVELOPER_BUILD - this creates a bunch of symlinks from the install dir back to the source directory. Avoids doing a build step for most python file changes.

These are also optional, but super useful."

I'm not sure, but wouldn't be MK_JOBS used only by Autotools?

I will check NTAX_DEVELOPER_BUILD carefully and how implement it.

@subutai
Copy link
Member

subutai commented Dec 17, 2013

@david-ragazzi Thanks.

PYTHONPATH: with the new system, if the user starts a new shell, types "python" and then wants to import nupic into the interactive python, how do they do it?

MK_JOBS - this is just a convention we use. It is for passing the -j argument to make, as in "make -j $MK_JOBS". It tells make to run that many compile steps in parallel. How are parallel compile jobs handled in cmake? Can you specify how many at a time to run in parallel?

@david-ragazzi
Copy link
Contributor Author

@subutai

When you have all this working it would be good to have @scottpurdy and @oxtopus review it as well. They are pretty familiar with the build system, much more than I am.

Yes, sure.

What happens to docs, tests, examples, and conf? Would they just stay where they are today?

I believe yes (not sure). After all, they are source code, except "docs" (where is it in repo?) which could be moved to $REPOSITORY root dir. This is a practice in several open-source projects.

Trust me, after these changes, the life will be easier for all us.. :-)

@david-ragazzi
Copy link
Contributor Author

@subutai

PYTHONPATH: with the new system, if the user starts a new shell, types "python" and then wants to import nupic into the interactive python, how do they do it?

This a tricky question. In case of OSX, I wrote CMake code to set env variables on the higher scope, ie an env variable is visible both in shell and GUI apps. This can be good but also a problem in case of this variable name also is used by other applications. But as far I know, as PYTHONPATH is set on .bashrc and this makes it visible for all applications, I believe that we wont have major problems. So answering your question, I dont believe that behaviour in python interpreter will be changed.

MK_JOBS - this is just a convention we use. It is for passing the -j argument to make, as in "make -j $MK_JOBS". It tells make to run that many compile steps in parallel. How are parallel compile jobs handled in cmake? Can you specify how many at a time to run in parallel?

This should be easy. CMake should have some command line flag for passing to Make. I'll check.

@david-ragazzi
Copy link
Contributor Author

@subutai

These are the env vars that CMake code is setting at moment:
set_environment_variable(PATH "${CMAKE_RUNTIME_OUTPUT_DIRECTORY}" ON)
set_environment_variable(PYTHONPATH "${PYTHON_SITE_PACKAGES_DIR}" OFF)
set_environment_variable(NTA "${CMAKE_INSTALL_PREFIX}" OFF)
set_environment_variable(NTA_ROOTDIR "${CMAKE_INSTALL_PREFIX}" OFF)
set_environment_variable(NTA_DATA_PATH "${CMAKE_INSTALL_PREFIX}/share/prediction/data" OFF)
set_environment_variable(LD_LIBRARY_PATH "${CMAKE_ARCHIVE_OUTPUT_DIRECTORY}" ON)
set_environment_variable(DYLD_LIBRARY_PATH "${CMAKE_LIBRARY_OUTPUT_DIRECTORY}" ON)

Obs: ON/OFF means that current value should be incremented, not replaced, ie PATH is $PATH:(new value).

@david-ragazzi
Copy link
Contributor Author

New version of Readme on my repo:
https://github.com/david-ragazzi/nupic/blob/master/README.md

@breznak
Copy link
Member

breznak commented Dec 17, 2013

this line: "cmake ../$REPOSITORY" should be just cmake ..

@david-ragazzi
Copy link
Contributor Author

@breznak

this line: "cmake ../$REPOSITORY" should be just cmake ..

Have you tested this? If so, no problems.

But I belive the correct is:
cmake $REPOSITORY/source

@david-ragazzi
Copy link
Contributor Author

Replaced by #515

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants