gemini cli (Requires prior installation of gemini cli)
Occasionally, the agent fails to follow the code of conduct. It is unclear whether this is an issue with the agent or whether the code of conduct is not yet fully optimized. In such cases, you can stop it by pressing the `esc` key and then give a command like, "Proceed with the task according to the workflow of the code of conduct," and it will work well. (Users of the paid version of gemini cli should use it with caution, as it can consume a lot of tokens.)
- Paste `GEMINI.MD` into the workspace folder.
- Run `gemini` in the terminal or PowerShell.
- Enter the following command: "Initialize the folder and file structure as specified in the code of conduct."
- Paste your materials into `assets` (materials for analysis and reference) and `guidelines` (documents containing user know-how and detailed instructions for the agent regarding analysis/results).
- Example of a task instruction: "Analyze the materials and write a report."
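The workspace layout described above can be sketched as a small script. The folder and file names come from the text; creating `GEMINI.MD` as an empty placeholder here is purely illustrative (in practice you would paste in the actual Code of Conduct before running `gemini` in this folder).

```python
# Sketch of the workspace layout described above. GEMINI.MD is created
# empty here as a placeholder; in practice you paste in the actual
# Code of Conduct file before running `gemini` in this folder.
from pathlib import Path

workspace = Path("workspace")
for sub in ("assets", "guidelines"):
    (workspace / sub).mkdir(parents=True, exist_ok=True)
(workspace / "GEMINI.MD").touch()

print(sorted(p.name for p in workspace.iterdir()))
# → ['GEMINI.MD', 'assets', 'guidelines']
```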
claude code (Requires prior installation of claude code)
I have not actually tested `claude code`. The following is written based on my personal hope and prediction that `gemini cli` will also support sub-agents in the future.
- Paste `CLAUDE.MD` and the `.claude` folder into the workspace folder.
- Run `claude` in the terminal or PowerShell.
- The rest of the process is the same.
In the past, creating intelligent 'agents' was a domain exclusive to developers who spoke the language of computers: 'code'. But now, we are witnessing an incredible paradigm shift. We can now design agents that think and act for themselves, based on the language humans understand best: 'natural language'.
The 'Code of Conduct' lies at the heart of this innovative shift. It is living proof that we can build highly intelligent agents without complex coding, simply by defining clear procedures for thought and principles for action.
The outputs generated by agents created in this way sometimes demonstrate insights equal to, or even surpassing, those of experts in the field. Through this, we may be getting a small taste of Artificial General Intelligence (AGI).
This entire journey began with a very practical question: "How can we increase satisfaction with the results from an Artificial Intelligence (LLM) from 10% to over 90%?"
Initially, when I gave an LLM a broad task like "Write a report about A," the results were disappointing. The content looked plausible, but it lacked depth and logical consistency. My personal satisfaction level was only about 10-50%.
The root of the problem was the 'approach'. When humans write a complex report, they don't write everything at once. They go through steps: finding resources, creating an outline, writing a draft, and revising.
Inspired by this simple fact, I began to break down the tasks given to the LLM into smaller, sequential requests, just like a human workflow.
- "Find 5 resources related to A."
- "Based on the resources found, create a table of contents for the report."
- "Write the body text for the first item in the table of contents."
Amazingly, by breaking down the work into these small units (Tasks), the satisfaction with the results soared to over 90%. By focusing on each step, the LLM produced far more accurate and consistent outputs.
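The decomposition idea above can be sketched as a loop over small Tasks, where each step's output becomes context for the next. The `call_llm` function is a hypothetical stand-in for whatever LLM API you actually use.

```python
# Sketch of breaking one broad request into sequential Tasks.
# call_llm is a hypothetical stand-in for a real LLM API call.
def call_llm(prompt: str) -> str:
    # Placeholder: echoes the last line (the current Task) back.
    return f"[output for: {prompt.splitlines()[-1]}]"

def run_report_pipeline(topic: str) -> list[str]:
    tasks = [
        f"Find 5 resources related to {topic}.",
        "Based on the resources found, create a table of contents for the report.",
        "Write the body text for the first item in the table of contents.",
    ]
    context = ""   # accumulated outputs keep goal consistency across steps
    outputs = []
    for task in tasks:
        result = call_llm(context + "\n" + task)
        outputs.append(result)
        context += "\n" + result
    return outputs

outputs = run_report_pipeline("A")
print(len(outputs))  # one output per Task
```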
Before the advent of agents like Gemini CLI, this idea was fleshed out through a manual experiment using multiple LLM chat windows simultaneously.
- Chat Window 1 (Planner): Sets the overall plan and defines the next Task.
- Chat Window 2 (Executor): Receives and executes the Task defined by the 'Planner'.
- Chat Window 3 (Prompt Engineer): Refines the prompt to help the 'Executor' produce the best possible results.
The most crucial part of this process was maintaining 'goal consistency'. The Planner always remembered the user's final instruction, and the Executor was told not just 'what' to do, but also 'why'. All work outputs were saved as files, becoming clear input data for the next step.
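The three-chat-window experiment can be sketched as three role functions passing text to one another, with every deliverable written to a file so the next step has clear input data. All names here are illustrative, and `call_llm` again stands in for a real model call.

```python
# Sketch of the manual experiment: Planner -> Prompt Engineer -> Executor,
# with deliverables saved as files. call_llm is a hypothetical stand-in.
from pathlib import Path

def call_llm(prompt: str) -> str:
    return f"[LLM output for: {prompt[:50]}]"

def planner(final_goal: str, step: int) -> str:
    # Always restates the user's final goal to maintain goal consistency.
    return f"Final goal: {final_goal}. Next Task (step {step}): draft section {step}."

def prompt_engineer(task: str) -> str:
    # Refines the Task, adding the 'why', before execution.
    return f"{task} Explain why this step serves the final goal as you work."

def executor(prompt: str, out_file: Path) -> Path:
    out_file.write_text(call_llm(prompt))  # deliverable saved as a file
    return out_file

goal = "Write a report about A"
for step in (1, 2):
    task = planner(goal, step)
    refined = prompt_engineer(task)
    executor(refined, Path(f"task_{step}.txt"))

print(Path("task_1.txt").exists(), Path("task_2.txt").exists())
```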
As I structured this manual workflow, I realized it bore a striking resemblance to an artificial neural network: an external instruction (input) is transformed into a final output through multiple stages of planning and execution (hidden layers). This became the backbone of the 'Code of Conduct' architecture.
The emergence of Gemini CLI gave me the confidence that this entire process could be automated and was the decisive catalyst for systematizing this idea under the name 'Code of Conduct'.
The 'Code of Conduct' doesn't stop at optimizing the workflow of a single agent. Its true potential is revealed when it is extended to an 'Agent Society' where multiple agents interact. This is akin to building an intelligent system that operates not like a single competent expert, but like a well-organized team or company.
The core of the 'Code of Conduct's' scalability lies in Recursive Delegation. This is the concept where a higher-level agent hires another, subordinate agent that also follows a 'Code of Conduct' to solve its own `Task`.
Category | Senior Agent (Manager) | Junior Agent (Practitioner) |
---|---|---|
Goal | Achieve a strategic task (e.g., market analysis report) | Solve a specific Task (e.g., summarize a particular article) |
Role | Decompose complex problems, define, and delegate sub-Tasks | Execute the clearly delegated Task according to the 'Code of Conduct' |
Interaction | Issues Task instructions and receives deliverables | Creates and reports back with deliverables (files) |
This hierarchical structure mimics the most efficient collaboration method in human society: the 'organization'. The manager sees the big picture, and the practitioner focuses on the details, maximizing the efficiency and expertise of the entire system. In actual AI research, this concept is actively being studied under the name 'Hierarchical Agent Teams', with ChatDev, which simulates a virtual software company, being a prominent success story.
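Recursive delegation can be sketched as a senior agent that splits its strategic task and hands each sub-Task to a junior agent. The decomposition rule below is a deliberately trivial placeholder.

```python
# Sketch of recursive delegation: a senior agent decomposes its task and
# delegates sub-Tasks to junior agents. The decomposition logic is a
# deliberately trivial placeholder.
def junior_agent(task: str) -> str:
    # Practitioner: executes one clearly delegated Task.
    return f"deliverable for '{task}'"

def senior_agent(strategic_task: str) -> list[str]:
    # Manager: decomposes the problem, delegates, and collects deliverables.
    sub_tasks = [f"{strategic_task} - part {i}" for i in (1, 2, 3)]
    return [junior_agent(t) for t in sub_tasks]

reports = senior_agent("market analysis report")
print(len(reports))  # → 3
```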
For an agent society to function smoothly, a clear method of communication is necessary. The 'Code of Conduct' is designed for agents to communicate through a Shared Memory called the `workspace`. The advantage here is that all information is clearly recorded in file form, allowing for transparent tracking of who did what.
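File-based shared memory can be sketched as two agents that never talk directly: one writes its deliverable into the workspace, and the other reads it from there. Paths and names are illustrative.

```python
# Sketch of Shared Memory communication: agents exchange information only
# through files in the workspace, so every step leaves a traceable record.
from pathlib import Path

ws = Path("workspace_demo")
ws.mkdir(exist_ok=True)

def agent_a() -> None:
    # Agent A writes its deliverable into the shared workspace.
    (ws / "summary.txt").write_text("Agent A: summary of source material.")

def agent_b() -> str:
    # Agent B has no direct channel to A; it reads A's file output.
    return (ws / "summary.txt").read_text() + " | reviewed by Agent B"

agent_a()
result = agent_b()
print(result)
```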
Furthermore, you can define the use of external Tools to perform specific `Tasks`.
- MCP (Model Context Protocol) Integration: By connecting an agent (or program) dedicated to structured tasks like database queries or external API calls as a tool, the LLM can focus more on creative work.
- Utilizing Specialized Agents: The overall system's expertise can be enhanced by calling specialized agents as tools, such as a 'Researcher Agent' for web searches or a 'Developer Agent' for code generation.
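The tool idea can be sketched as a registry mapping tool names to functions, with the agent dispatching by Task type. Both tools below are hypothetical stubs standing in for real integrations.

```python
# Sketch of tool use: a registry of specialized tools the agent can call.
# Both tools are hypothetical stubs standing in for real integrations
# (e.g., a database query via MCP, or a web-search researcher agent).
def db_query_tool(query: str) -> str:
    return f"[rows matching '{query}']"

def researcher_tool(topic: str) -> str:
    return f"[web findings on '{topic}']"

TOOLS = {"db_query": db_query_tool, "research": researcher_tool}

def agent_step(tool_name: str, argument: str) -> str:
    # In practice the LLM would choose tool_name; here we dispatch directly.
    return TOOLS[tool_name](argument)

print(agent_step("research", "market size"))
```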
In this way, the 'Code of Conduct' presents a blueprint for a scalable and flexible ecosystem where agents and tools with diverse specializations are organically connected and collaborate.
The most significant implication of this model is that it is not just a theoretical model but a successful experiment that has actually built a functioning 'natural language-based backend system'.
The Code of Conduct we've created operates in a way that is strikingly similar to a traditional backend system.
- It receives a user's request (API call),
- Internal agents collaborate according to the 'Code of Conduct' workflow, saving results to a database, etc. (internal logic processing),
- And generates the final result as a file (response).
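The backend analogy above can be sketched end to end as a single handler: request in, internal workflow, file out. Everything here is illustrative, with the internal agent collaboration stubbed as one step.

```python
# Sketch of the 'natural language backend' analogy: request in, internal
# workflow, file out. All names are illustrative; the internal agent
# collaboration is stubbed as a single processing step.
from pathlib import Path

def handle_request(user_request: str) -> Path:
    # 1. Receive the user's request (the 'API call')
    # 2. Internal agents process it per the workflow (stubbed here)
    result = f"Report generated for request: {user_request}"
    # 3. The final deliverable is a file (the 'response')
    out = Path("response.txt")
    out.write_text(result)
    return out

path = handle_request("Analyze market A")
print(path.read_text())
```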
This is a case that proves it's possible to implement an intelligent automation system using only logical instructions and structured natural language, without a single line of traditional programming code. In other words, the 'Code of Conduct' has shown in reality that it can be both a 'blueprint' for complex software and an executable 'engine' in itself.
And the most astonishing part is that the design for this entire complex system, which in the past would have required countless lines of code, can now be drawn in 'natural language'. This signifies the democratization of AI development and a fundamental shift in the way we collaborate towards the AGI era.
The structural similarity between the 'Code of Conduct' and artificial neural networks goes beyond a simple analogy, prompting a provocative question:
"Could we reverse-engineer the abstract workflow model of an agent to design a physical artificial neural network architecture?"
This is an attempt to apply insights gained from software architecture (the Code of Conduct) to the design of hardware (or an equivalent neural network model), which could lead to innovative ideas like the following.
Currently, most LLMs are akin to a giant, monolithic architecture where all neurons are densely interconnected. However, the 'Code of Conduct' clearly defines roles for each function, such as `Phase` or `Stage`.
Applying this idea to neural networks, one can imagine a network composed of multiple small 'Expert Modules', each dedicated to a specific function.
- Language Understanding Module, Code Generation Module, Image Analysis Module, etc., would exist independently.
- A higher-level 'Router' or 'Coordinator' module would exist to control them, much like the 'Planner' in the 'Code of Conduct'.
- Depending on the type of task, the 'Router' would select and activate the most appropriate expert modules to solve the problem. This would reduce unnecessary computation, leading to a much more efficient and flexible architecture where specific modules can be easily replaced or upgraded.
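The router-plus-expert-modules idea can be sketched as follows; the keyword-based routing rule is a toy stand-in for a learned router network.

```python
# Sketch of a modular architecture: a router activates only the expert
# module relevant to the task. The keyword routing below is a toy stand-in
# for a learned router.
def language_module(task: str) -> str:
    return f"[language analysis of: {task}]"

def code_module(task: str) -> str:
    return f"[generated code for: {task}]"

def image_module(task: str) -> str:
    return f"[image analysis of: {task}]"

EXPERTS = {"text": language_module, "code": code_module, "image": image_module}

def router(task: str) -> str:
    # Select one expert; the others stay inactive, saving computation.
    if "function" in task or "script" in task:
        kind = "code"
    elif "picture" in task or "image" in task:
        kind = "image"
    else:
        kind = "text"
    return EXPERTS[kind](task)

print(router("write a script to sort numbers"))
```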
The core of the 'Code of Conduct' is the 'Chain of Execution', where the result of a previous `Task` influences the path and content of the next `Task`. This provides a crucial insight into the flow of information within a neural network.
- Dynamic Routing: In current neural networks, the path of information flow is mostly fixed once the input data is given. However, by applying the 'Chain of Execution' concept, we could implement a 'dynamic neural network' where the information path is determined in real-time based on the input data and intermediate processing results.
- Conditional Activation: Not all neurons need to be active all the time. Only the optimal 'Chain of Execution'—that is, the optimal path of neurons required to solve the problem—would be conditionally activated. This is similar to how specific areas of the brain activate for specific tasks, and it would enable far more efficient and powerful reasoning.
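Dynamic routing and conditional activation can be sketched as a chain whose next step depends on the previous result; the branching condition and both paths are illustrative.

```python
# Sketch of a dynamic Chain of Execution: the intermediate result decides
# which processing path activates next, rather than a fixed pipeline.
def analyze(x: float) -> str:
    # Intermediate result that determines the downstream path.
    return "complex" if x > 0.5 else "simple"

def light_path(x: float) -> float:
    return x * 2  # cheap path for simple inputs

def heavy_path(x: float) -> float:
    return sum(x / n for n in range(1, 5))  # costlier path for complex inputs

def dynamic_chain(x: float) -> float:
    # Conditional activation: only the needed path runs.
    return heavy_path(x) if analyze(x) == "complex" else light_path(x)

print(dynamic_chain(0.2))  # → 0.4 (light path)
```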
The ability to conceive of such advanced neural network architectures not with code but with natural language, and to simulate their logical validity, is the most innovative value that the 'Code of Conduct' offers. We are no longer just users of AI; we are becoming the 'architects' who design the structure of future intelligence with our own language. Isn't this the most realistic way for us to get a 'taste' of the potential of AGI?
I am not in a position to develop AI models myself, but this makes me wonder whether current AI models are already being built with such a structure.