optillm/cepo/README.md
2 additions & 18 deletions
@@ -6,7 +6,7 @@ If you have any questions or want to contribute, please reach out to us on [cere
 
 ## CePO Methodology
 
-In CePO, the Best of N technique is applied to `bestofn_n` solution candidates. Each solution is generated through the following four steps:
+In CePO, the Best of N technique is applied to `bestofn_n` solution candidates. Optionally (when `cepo_use_plan_diversity` is set to `True`), the model will attempt to come up with a different approach for each of the best-of-n completions. Each completion is generated through the following four steps:
 
 **Step 1**: Plan Generation
 The model generates a detailed, step-by-step plan to solve the problem, along with its confidence level for each step.
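
For orientation, the sketch below mirrors the control flow the updated paragraph describes: `bestofn_n` candidate completions, each optionally seeded with a distinct approach when `cepo_use_plan_diversity` is enabled. It is an illustrative outline only, not optillm's actual CePO code; `CepoSketchConfig`, `propose_approaches`, `generate_completion`, and `rate_completion` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Illustrative sketch of the best-of-n loop described above; names and
# signatures are hypothetical, not optillm's actual CePO implementation.

@dataclass
class CepoSketchConfig:
    bestofn_n: int = 3                      # number of candidate completions
    cepo_use_plan_diversity: bool = False   # seed each candidate with a distinct approach


def best_of_n(
    task: str,
    config: CepoSketchConfig,
    propose_approaches: Callable[[str, int], List[str]],
    generate_completion: Callable[[str, Optional[str]], str],
    rate_completion: Callable[[str, str], float],
) -> str:
    """Generate `bestofn_n` candidate completions and return the best-rated one."""
    # Optionally ask the model for one distinct high-level approach per candidate.
    approaches: List[Optional[str]] = (
        list(propose_approaches(task, config.bestofn_n))
        if config.cepo_use_plan_diversity
        else [None] * config.bestofn_n
    )

    # Each candidate would internally go through the four steps described above:
    # plan generation, plan execution, plan refinement, and the final answer.
    candidates = [generate_completion(task, approach) for approach in approaches]

    # The Best of N selection: keep the highest-rated completion.
    return max(candidates, key=lambda c: rate_completion(task, c))
```

A caller would supply LLM-backed implementations of the three callables; in CePO itself each candidate is produced by the four-step procedure outlined above, and the rating step performs the Best of N selection.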
@@ -25,20 +25,4 @@ The model uses the refined plan from Step 3 to produce the final answer.
 
 ## CePO Current Status
 
-This project is a work in progress, and the provided code is in an early experimental stage. While the proposed approach works well across the benchmarks we tested, further improvements can be achieved by task-specific customizations to prompts.
-
-## CePO Ablation studies
-
-We conducted ablation studies to evaluate the impact of various hyperparameters in the CePO framework. Our results indicate that the chosen hyperparameter settings strike a good balance between computational cost and accuracy.
-
-Interestingly, the self-critique and quality improvement capabilities of existing off-the-shelf models do not always scale proportionally with increased inference compute. Addressing this limitation remains a key focus, and we plan to explore custom model fine-tuning as a potential solution in the future.
+This project is a work in progress, and the provided code is in an early experimental stage. While the proposed approach works well across the benchmarks we tested, further improvements can be achieved by task-specific customizations to prompts.