Support multiple initial programs #126
Draft: 0x0f0f0f wants to merge 5 commits into codelion:main from 0x0f0f0f:ale/multiple-initial-programs-2
We should discuss the positioning of arguments.
The MR currently changes the usage of the openevolve-run.py CLI. Before, initial_program was the first positional argument. I've swapped the order, and also added [--initial-programs-dir INITIAL_PROGRAMS_DIR].
It would probably be best if, instead of passing the directory, the first argument stayed the evaluator and we then required one or more initial programs as positional arguments, so users can do something like
python /path/to/openevolve-run.py evaluator.py initial_program1.py initial_program2.py other_programs_dir/*.py
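A minimal sketch of how such a command line could be parsed with Python's argparse, assuming an evaluator positional followed by one or more program paths. The names used here (build_parser, evaluator_file, initial_programs) are illustrative assumptions and are not taken from the actual openevolve-run.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the argument names are assumptions for illustration.
    parser = argparse.ArgumentParser(prog="openevolve-run.py")
    parser.add_argument("evaluator_file", help="path to the evaluator script")
    # nargs="+" requires at least one value and collects all of them into a list.
    # Shell globs such as other_programs_dir/*.py are expanded by the shell
    # before the script starts, so they simply arrive as extra list entries.
    parser.add_argument(
        "initial_programs",
        nargs="+",
        help="one or more initial program files",
    )
    return parser


args = build_parser().parse_args(
    ["evaluator.py", "initial_program1.py", "initial_program2.py"]
)
print(args.evaluator_file, args.initial_programs)
```

With this layout, passing only the evaluator is rejected by argparse, since at least one initial program is required.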
Updated in the latest version.
How about we keep the existing ordered args for backwards compatibility? In addition, we can add an --initial-programs argument that takes either a directory or a list of paths.
This is what I attempted in the previous commit, and the argument-parsing logic did not look very good. With the current state, we can definitely do
python openevolve-run.py evaluator.py prog1.py prog2.py ./a/b/c/*.py
and all the other bash goodies.
Having the positional arguments be initial_program.py evaluator.py, followed by --initial_programs dir/*.py, causes some issues. One always has to provide one initial program and cannot omit the first positional argument, and the behavior/UX of defining both the initial_program argument and --initial_programs together is not really clear.
I would go for the breaking change if possible! (It also keeps things very simple on the OpenEvolve side: no directory listing, etc.)
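To illustrate the concern, here is a rough sketch of the backwards-compatible layout, assuming the two original positionals plus an optional --initial_programs flag. The parser and attribute names are hypothetical. The sketch only shows that argparse still demands both positionals even when the flag is given.

```python
import argparse


def build_legacy_parser() -> argparse.ArgumentParser:
    # Hypothetical layout: the two original positionals plus an optional list flag.
    parser = argparse.ArgumentParser(prog="openevolve-run.py")
    parser.add_argument("initial_program")
    parser.add_argument("evaluator_file")
    parser.add_argument("--initial_programs", nargs="*", default=[])
    return parser


parser = build_legacy_parser()

# Both the positional and the flag are set: the parser accepts this, but which
# one should win (or whether they are merged) is left for the caller to decide.
both = parser.parse_args(
    ["initial_program.py", "evaluator.py", "--initial_programs", "a.py", "b.py"]
)
print(both.initial_program, both.initial_programs)

# Dropping the first positional does not work: "evaluator.py" is consumed as
# initial_program, and argparse exits because evaluator_file is missing.
try:
    parser.parse_args(["evaluator.py", "--initial_programs", "a.py", "b.py"])
    flag_only_accepted = True
except SystemExit:
    flag_only_accepted = False
print(flag_only_accepted)
```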
Do you have an example where multiple initial programs would be required? If the idea is to evolve an entire codebase at once, then having a folder path instead of a file path as the first argument should be sufficient. We can then use the config.yaml to control which file types or other things are in scope of evolution vs. out of scope, similar to what was suggested in #111.
In my use case, I just want to pick a single output program from many starting programs.
The issue with having the folder path as the first argument (which is what I attempted at the beginning of this branch) is that it requires the openevolve CLI to perform a lot of extra logic: directory listing, extension parsing, etc. Having a + vararg allowed me to remove all of this extra logic in the latest commits. If the user wants many initial programs, which are initialized per-island, then selecting these programs from a directory can be done via normal bash globbing:
openevolve-run evaluator.py file1 file2 dir1/foo/{a.py,b.py} dir2/**.py
If we go for something like
openevolve.py initial_program.py evaluator.py --initial-programs=dir/
then we run into problems: initial_program.py is not optional, so initial_program.py and --initial-programs=dir/ would not be interchangeable.
So I think this breaking change is worth it, to not reinvent the wheel :)
For evolving an entire codebase, which is a different goal from mine, we could have an openevolve-run.py --codebase mode later. The logic would be that each starting program needs an output program, so it's quite different.
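For comparison, a sketch of the kind of directory handling the CLI would have to own if it accepted a folder rather than a list of files. The helper name collect_programs and the default extension filter are hypothetical, meant only to show the "directory listing, extension parsing" work that shell globbing already covers.

```python
from pathlib import Path
from typing import Iterable, List


def collect_programs(directory: str, extensions: Iterable[str] = (".py",)) -> List[str]:
    """Hypothetical helper: list program files in a directory, filtered by extension."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    allowed = set(extensions)
    # Sort for a deterministic order, since directory iteration order is unspecified.
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix in allowed
    )
```

Even this small version has to make decisions (which extensions count, whether to recurse, how to order results) that the vararg approach leaves to the shell.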
If it is just about assigning different initial programs to different islands, we can do it via the config.yaml. We can keep one initial program.py as we do now, and in the config.yaml, alongside the island configs, we can also add paths to other initial programs to be assigned to the islands. We will need to ensure that the config has sufficient islands, so it may be better to define them in the same place, in the config itself.
This is done in this PR. After initializing the config, if the number of islands is less than the number of programs, it throws. A very simple check.
It's done when loading the programs.
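The check described above could look roughly like the following. This is a sketch under assumptions, not the code from the PR: the function name, its parameters, and the choice of ValueError are all hypothetical.

```python
from typing import Sequence


def check_island_capacity(num_islands: int, initial_programs: Sequence[str]) -> None:
    """Hypothetical validation: every initial program needs its own island."""
    if num_islands < len(initial_programs):
        raise ValueError(
            f"{len(initial_programs)} initial programs were given, "
            f"but the config only defines {num_islands} islands"
        )


# Passes silently: three islands are enough for two programs.
check_island_capacity(3, ["prog1.py", "prog2.py"])
```

Running the check at program-loading time means a mismatch surfaces before any evolution starts.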
I also thought about that, but it might introduce additional logic as well. I don't really like the idea, because having inputs defined in a config feels like it violates the UNIX philosophy.