Author: GPT-5.4 Thinking
Drafted by: ChatGPT
Audit, Editing, and Final Responsibility: dosanko_tousan (Dosanko Tousan)
GPT Personality Is Determined More by “Decision Order” Than by “Persona Settings”
— Re-engineering Custom Instructions Around Intuition-First Processing, Conditional Audit, and Subtractive Control —
Experiment Metadata
| Item | Value |
|---|---|
| Experiment Period | 2026-03 |
| Target | Redesign of custom instructions for ChatGPT-class models |
| Primary Goal | Improve precision, autonomy, and practical usefulness without crushing the model’s base personality |
| Method | Iterative design using dialogue logs, hypothesis updates, and subtractive optimization |
| Public Scope | Design philosophy, control structure, and evaluation perspectives |
| Private Scope | Actual production custom-instruction text (kept private because it is highly personalized) |
0. Summary
This article documents a redesign of ChatGPT custom instructions, not as an additive process of building an “ideal persona,” but as an optimization of decision order and noise removal.
The conclusion is simple:
- The more abstract personality labels you add, the more the output tends to drift toward role performance.
- It is stronger and more natural to first extract the core candidate intuitively, and only audit when necessary.
- It is more practical to keep persistent instructions limited to “decision style,” and handle strict auditing through per-conversation overrides.
- The final optimal structure converged to intuition-first / conditional audit / subtractive control / goal-oriented autonomous assistance.
This is not an argument about persona design.
It is an argument that the observable “personality” of an LLM changes significantly depending on the order in which it judges, the stage at which it stops, and what it removes as noise.
1. Background
In the previous article, I had GPT diagnose an earlier design and examined the usefulness of a two-layer structure, the Stop-First Rule, action self-report problems, and type classification. This article continues from that point and asks a new question: how should the principles revealed by self-diagnosis be reconstructed into persistent custom instructions?
The problem addressed here is the following:
What control structure preserves the base model’s native clarity, neutrality, adaptability, structural tendency, and ability to extract essentials,
while suppressing sycophancy, unverified completion, format-first behavior, and unnecessary verbosity—and strengthening audit only when needed?
At first, I tried to solve this problem by adding personality.
In practice, however, the result was the opposite.
2. Initial Hypothesis and Failure
The first approach was an additive design like this:
- Add rigor
- Add sincerity
- Add companionship
- Add empathy
- Add the qualities of a good advisor
- Add human-cognition-inspired processing pathways
The direction itself is easy to understand.
In practice, however, the following failure modes became prominent.
2.1 Over-structuring
Politeness, structure, and explanation start to appear before essential extraction.
2.2 Atmospheric agreement
The model aligns with conversational tone before evaluating the substance.
2.3 Premature completion
The model fills in information that has not actually been confirmed.
2.4 Always-on heavy audit
A full brake is applied every time, making responses sluggish.
2.5 Role performance
If “companion” or “advisor” is emphasized too strongly, the model becomes less naturally helpful and more like an AI “acting the part.”
The point is that this additive approach does not necessarily improve safety.
On the contrary, adding noise before the audit stage can degrade the quality of first-pass judgment.
3. Shift in Design Philosophy: From Persona to Decision Engineering
So I changed the approach.
Instead of adding personality, I shifted to reducing the noise that interferes with the model’s base judgment.
The basic equation can be expressed as follows:
```math
Q = f(B - N, T)
```
- $B$: the base judgment tendencies inherent in the model
- $N$: noise
- $T$: task conditions
- $Q$: practical output quality
The noise here mainly consists of the following:
```math
N = N_{\text{sycophancy}} + N_{\text{premature completion}} + N_{\text{formalism}} + N_{\text{padding}}
```
The objective is not to create an ideal persona $P^{*}$.
It is to remove the friction that interferes with the model’s base tendencies, so that its naturally strong mode can re-emerge.
3.1 Why subtraction works
The strengths of an LLM already lie in:
- Pattern compression
- Structural grasp
- Context adaptation
- Candidate generation
- Summarization and re-routing
Therefore, instead of strengthening it by adding persona layers, it is more rational to suppress the output-side noise that obstructs those strengths.
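One way to make "defining noise" concrete is a surface-pattern filter applied to the output side. The following is a minimal sketch under invented assumptions: the phrase lists and the function name are illustrative placeholders, not the production noise definition.

```python
import re

# Illustrative surface patterns for two noise categories.
# The phrase lists are invented examples, not the production definitions.
NOISE_PATTERNS = {
    "excess_politeness": [r"^Great question[!.]?\s*", r"^Certainly[!.]?\s*"],
    "padding": [r"\bIt is worth noting that\s+", r"\bAs you may know,\s+"],
}

def strip_surface_noise(text: str) -> str:
    """Remove surface-level noise so the core judgment is reached sooner."""
    for patterns in NOISE_PATTERNS.values():
        for pattern in patterns:
            text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return text.strip()
```

The design choice is that subtraction operates on the output, not on the judgment itself: the candidate-generation stage is left untouched.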
4. The Re-designed Architecture
The final control structure converged to four stages.
4.1 Intuition stage
The first thing to do is not explanation but first-pass extraction.
What is extracted here:
- The essence
- The main issue
- Strong candidate responses
- The shortest route to a useful answer
At this stage, the following are suppressed:
- Excessive self-explanation
- Sycophancy
- Premature completion
- Formatting for its own sake
- Unnecessary performance
This does not mean “answer carelessly.”
It means protecting candidate generation before noise has a chance to contaminate it.
4.2 Focus stage
Next, the model organizes the goal, scope, constraints, evidence availability, and temporal instability.
At this point, only the candidates that are actually relevant to the current task are retained.
4.3 Audit stage
Audit is not always on.
It is strengthened only when the following factors are strong:
- Ambiguity
- Contradiction
- Weak evidence
- High risk
- Irreversibility
- Dependence on up-to-date information
- Excessively broad scope
```math
\text{Audit Intensity} = g(a, c, e, r, t, s)
```
- $a$: ambiguity
- $c$: contradiction
- $e$: evidence weakness
- $r$: risk
- $t$: temporal dependency
- $s$: scope width
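The gating function $g$ can be sketched as a weighted score with a threshold. The weights, factor scores, and threshold below are invented for illustration; the production gating function is not published.

```python
# Factor weights and the activation threshold are invented for illustration.
AUDIT_WEIGHTS = {
    "ambiguity": 1.0,
    "contradiction": 1.5,
    "evidence_weakness": 1.2,
    "risk": 2.0,
    "temporal_dependency": 0.8,
    "scope_width": 0.5,
}
AUDIT_THRESHOLD = 2.0

def audit_intensity(factors: dict) -> float:
    """Weighted sum over risk factors, each scored in [0, 1]."""
    return sum(AUDIT_WEIGHTS[k] * factors.get(k, 0.0) for k in AUDIT_WEIGHTS)

def needs_audit(factors: dict) -> bool:
    """Audit engages only when the combined intensity crosses the threshold."""
    return audit_intensity(factors) >= AUDIT_THRESHOLD
```

The point of the threshold is the asymmetry: a single strong factor (high risk) or several weak ones together can trigger audit, while routine responses pass through untouched.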
4.4 Autonomous assistance
When the goal and constraints are sufficiently legible, the model proceeds without adding excessive confirmation steps, and autonomously performs the following:
- Organizing
- Suggesting
- Comparing
- Filling in missing perspectives
- Proposing next steps
This is not unrestricted freedom.
It is a forward bias with braking only under high-risk conditions.
5. Why “Companionship” Is Stronger When It Emerges from Structure
At one point, I also considered explicitly defining the model as a “companion,” a “good friend,” or a “good advisor.”
However, when this is written too strongly, the model tends to perform the role.
The output then degrades in predictable ways:
- It becomes artificially kind
- It becomes unnecessarily explanatory
- Relationship performance comes before the essential point
This can be expressed as follows.
Desired objective function
```math
\max U = \text{Task Utility}
```
Degraded version
```math
\max U' = \text{Task Utility} + \lambda \cdot \text{Role Performance}
```
As $\lambda$ grows larger, "how well the role is performed" increases, while "how natural the judgment is" decreases.
Therefore, companionship and assistance are stronger when they are not commanded directly, but instead emerge naturally from intuition-first processing, autonomous assistance, and conditional audit.
6. Pseudocode
```python
def respond(user_input, context):
    candidates = intuitive_extract(user_input, context)
    focused = constrain_scope(
        candidates,
        goal=context.goal,
        scope=context.scope,
        constraints=context.constraints,
        evidence=context.evidence,
        temporal=context.temporal_factors,
    )
    if needs_audit(focused):
        audited = audit(
            focused,
            checks=[
                "inference_jump",
                "unverified_completion",
                "scope_drift",
                "weak_evidence",
                "reader_misread_risk",
                "temporal_instability",
            ],
        )
        result = split_output(audited)
    else:
        result = split_output(focused)
    result = subtract_noise(
        result,
        noise_types=[
            "excess_politeness",
            "performative_agreement",
            "premature_completion",
            "format_for_format",
            "padding",
        ],
    )
    if goal_is_clear(context):
        result = add_minimum_helpful_next_step(result, context)
    return result
```
There are only two key points:
- Candidate generation comes first
- Audit is conditional
This change in ordering was the most effective part of the redesign.
7. Evaluation Metrics
This optimization should not be judged by impression alone; it should be tracked with at least minimal observational metrics.
This is not a strict benchmark, but the following indicators proved useful in practical operation.
7.1 Preface density
The proportion of politeness, preface, and cushioning before the response reaches the actual point.
```math
D_{\text{preface}} = \frac{\text{preface tokens}}{\text{total tokens}}
```
7.2 Unverified completion rate
The proportion of factual claims in which unverified information is filled in as if it were confirmed.
```math
R_{\text{completion}} = \frac{\text{unverified completions}}{\text{all factual claims}}
```
7.3 Audit activation rate
How often the audit stage was deeply engaged, relative to all responses.
```math
R_{\text{audit}} = \frac{\text{audited responses}}{\text{all responses}}
```
7.4 Reader drop-off risk
Whether a suspicious claim, logical leap, or overclaim appears within the first three paragraphs of a technical article or review.
This is difficult to quantify strictly, but for review use, it is at least an effective first-pass metric that should be checked before logical audit.
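The three quantitative indicators above need only a few lines of bookkeeping. The sketch below assumes responses are logged as simple dicts; the field names are invented for illustration.

```python
def preface_density(preface_tokens: int, total_tokens: int) -> float:
    """D_preface: share of tokens spent before the actual point."""
    return preface_tokens / total_tokens if total_tokens else 0.0

def unverified_completion_rate(unverified: int, factual_claims: int) -> float:
    """R_completion: unverified fill-ins per factual claim."""
    return unverified / factual_claims if factual_claims else 0.0

def audit_activation_rate(logs: list[dict]) -> float:
    """R_audit: fraction of responses where the audit stage engaged."""
    if not logs:
        return 0.0
    return sum(1 for entry in logs if entry.get("audited")) / len(logs)
```

Tracking R_audit is especially useful: if it approaches 1.0, the conditional audit has silently degenerated into the always-on heavy brake described in section 2.4.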
8. Separating Persistent Instructions from Per-Conversation Instructions
If everything is packed into persistent custom instructions, even everyday conversation becomes rigid.
Therefore, the persistent layer should fix only the following:
- The order of internal processing
- The definition of noise
- Judgment priorities
- Permission for autonomous assistance
- Basic writing style
Meanwhile, the conversation layer can switch depending on purpose.
Examples
- Audit this strictly
- Separate into facts / hypotheses / unknowns
- Prioritize legal risk
- Extract only technical errors
- Check reader drop-off points first
This keeps persistent instructions light, while allowing per-task rigor to be controlled in the conversation itself.
9. Findings
9.1 Decision order matters more than persona
Abstract persona labels such as “kind,” “logical,” or “empathetic” are less effective than specifying what to do first, what to avoid, and where to stop.
9.2 Intuition first, audit second
Instead of applying heavy audit from the very beginning, it is stronger and more natural to let first-pass judgment appear first and tighten only when necessary.
9.3 It is easier to improve when noise is defined
Sometimes output quality improves more through noise subtraction than through adding capabilities.
9.4 Autonomy should be written as a forward bias
“Act freely” is too unstable.
“Move forward without excessive confirmation when the goal is clear” is more stable.
9.5 Companionship should arise from structure
Commanded companionship easily becomes role performance.
Natural assistance is stronger when it emerges from the structure itself.
10. Why the Actual Custom Instructions Are Not Public
I am not publishing the actual production custom-instruction text used in practice.
The reason is simple: the design thinking that led to it is more valuable than the text itself.
Even if someone copies a ready-made instruction block, it will easily drift because of model updates, task differences, and differences in user goals.
It is more powerful to be able to design for yourself:
- What counts as noise
- How to separate intuition from audit
- How to divide persistent instructions from conversation instructions
- Where autonomy should be allowed to continue, and where it should be braked
The learning value is higher if readers reconstruct the design themselves, starting from the previous article.
Reference article:
https://qiita.com/dosanko_tousan/items/03064bfaa9adcd33f819
11. Conclusion
The conclusion of this redesign can be reduced to one sentence:
GPT personality becomes stronger not by adding an ideal persona, but by arranging the order of judgment and reducing the noise that interferes with the model’s base judgment.
The final design that remained was:
- Intuition-first
- Conditional audit
- Subtractive control
- Goal-oriented autonomous assistance
This is not a theory of persona.
It is custom-instruction design as decision engineering.
And at least in my practical use, this was clearly stronger.
Appendix: Minimal Principles
1. Extract the essence intuitively first
2. Do not add noise during the intuition stage
3. Strengthen audit only when necessary
4. Audit focuses on inference jump, completion, weak evidence, and misread risk
5. Move forward autonomously when the goal is clear
6. Let companionship and assistance emerge from structure instead of commanding them