Nansen

在 macOS 上安装 Rodin 的步骤(M1 ARM 架构):

  1. 安装 x86 JDK

  2. 下载并安装 Rodin

  3. 修复 macOS 安全权限

  • 运行以下命令以绕过”应用程序已损坏”的错误:
1
sudo xattr -cr /Applications/Rodin.app

Rodin 应该可以正常运行,无需其他步骤(例如配置 Java VM 路径)。

Steps to Install Rodin on macOS (M1 ARM Architecture):

  1. Install x86 JDK
    • Download the Intel x86 JDK 18 (macOS .dmg) from Oracle:

      jdk-18.0.2.1_macos-x64_bin.dmg

    • Run the installer; it automatically configures the Java environment.

  2. Download and Install Rodin
  3. Fix macOS Security Permissions
  • Run this command to bypass “app is damaged” errors:
1
sudo xattr -cr /Applications/Rodin.app

Rodin should be able to run without any additional steps (e.g., configuring the Java VM path).

这篇文章主要介绍了如何对传统分组密码算法 FEAL-4 进行差分分析(Differential Cryptanalysis),并展示了攻击进程和关键实现细节。文中还包含了部分实验数据和攻击结果,可供有兴趣的读者参考或复现。


两张重要图片

在正式展开分析之前,先给出两张与 FEAL-4 密切相关的图片,方便对差分传播和整体加密流程有一个直观认识。

  1. 差分传播示意图
    下图展示了在 FEAL-4 中,明文差分(XOR 形式)如何在加密轮次中传播并最终反映到密文上。

  2. FEAL-4 工作流示意图
    下图是 FEAL-4 的整体加密框架示意,包括明文的左右分块、各轮子密钥注入以及 F 函数的运用流程。


F 函数的特性

FEAL-4 的轮函数(即 F 函数)可以视作对 32 位输入进行某种随机化排列(Permutation)。在差分分析中,我们并不需要 F 的具体实现,而更关注它的两个关键差分属性:

  • ,则
  • ,则

有了这两个结论,在后续分析时就能推断特定差分输入下的输出表现,从而帮助定位子密钥。


差分路径的详细分析

假设我们选择一对特殊的明文 ,满足:

这个特定差分向量有助于简化 FEAL-4 若干轮次输出中的差分计算。

1. 初始轮次差分推导

  • 表示 的左右分块(同理 表示 的左右分块)。

  • 经过第一轮的异或操作后:

    同理可推出右半部分的差分。

  • 进一步由于加密时存在 “额外一轮 XOR”,可得到:

    这意味着在后续计算中,差分会以更简单的形式得以归并。

2. 后向差分计算(Backward Calculation)

当我们获得明文对 与对应的密文对 后,可以进一步进行后向推导。令密文记作 ,则:

  • 计算出 ,得到
  • 根据已知差分性质,可在末轮逐步还原 等中间变量。
  • 利用这些结果便可对各子密钥展开攻击或验证。

攻击 K3

在差分分析中,往往先从某个中间轮次子密钥开始,例如 。具体步骤如下:

  1. 选取明文对: 我生成了 12 对满足

    的随机明文。
  2. 加密并提取差分: 将上述明文对加密后,得到相应的密文对;结合差分路径推导出 等中间值。
  3. 遍历

    只要找到满足差分方程 的候选即保留。

通过这种方法,我成功得到四个候选子密钥:

1
cfa38976, cfa309f6, 4f238976, 4f2309f6

执行以上过程的命令示例为:

1
./main -mode=attackk3 -file=K3.txt

该实现与我在 www.theamazingking.com 上介绍的方法一致。


攻击 K2

在拿到 的若干候选后,可以继续推算 。其核心方程:

其中 来自于上一轮运算(包括对 等的进一步计算)。同样地:

  1. 生成明文对,满足差分条件
  2. 抓取加密后密文差分,通过遍历所有可能 值,检查是否能满足差分约束
  3. 命令示例:
    1
    ./main -mode=attackk2 -file=K2.txt -k3=cfa38976,cfa309f6,4f238976,4f2309f6
  4. 最终输出示例如下所示:
    1
    2
    3
    4
    5
    Candidate K2: 8b722e41 (K3: cfa38976)
    Candidate K2: 8b72aec1 (K3: cfa38976)
    ...
    Candidate K2: 89722e43 (K3: 4f2309f6)
    Candidate K2: 8972aec3 (K3: 4f2309f6)

攻击 K1

类似地,攻击 也可以按前面流程来做。
为了让差分路径能稳定呈现预期结果,我生成了 12 对满足

的明文,并验证其加密输出。

命令示例:

1
./main -mode=attackk1 -file=K1.txt -k3k2="cfa38976,8b722e41; cfa38976,8b72aec1; ..."

可一次性将前面得到的所有 候选组合带入,最终找到对应的 候选。


攻击 K0, K4, K5

全部到手后,就可以反推最后一轮所需的
设最终加密得到的左、右部分为 。对应的明文分块为 。则有:

只要枚举所有可能的 ,即可通过上述方程解出相应的 。然后用其他明文-密文对交叉验证正确性。若都满足,则得到正确的最终密钥集。

命令示例:

1
./main -mode=attackk0k4k5 -k3k2k1="..." -file=K1.txt -file2=K1_p.txt

实现、优化与验证

1. 自动化生成与并行计算

  • 自动生成随机明文对:在代码中事先控制 XOR 差分,即可批量输出满足需求的明文文件(如 K3_p.txt, K2_p.txt, K1_p.txt)。
  • 并行化搜索:使用 Go 语言的 Goroutines,开启 10 个并行任务,在带有 10 核的 Mac 上最大化利用 CPU 资源。
  • 进度条与可视化:通过 progressbar 库实时显示搜索进度,让实验过程更直观。

2. 验证方法

  • 本地自定义密钥测试
    先选定一套已知的 (如 0x00000001, 0x00000002, 0x00000004, 0x00000008, 0x00000010, 0x00000020),对前面生成的明文对进行加密,并重复差分分析过程,看是否能在结果中匹配这套密钥。
    实验证明能够正确还原这组测试密钥,从而确认代码逻辑的准确性。

  • 特定明文-密文对测试
    选定一个明文 1234567890abcdef 和在 Einstein Zone 生成的密文 f43ae3eeb56e2bbf,验证最终得到的 256 组候选 中,每一组都可重现这条加密映射,进一步确认攻击流程无误。


最终结果

所有解出的 存放于 final_result.txt。去重后包括以下范围:

Possible Values for K0

1
2
890c2148 890ca1c8 098c2148 098ca1c8 ...
0b8c214a 0b8ca1ca 8b0c214a 8b0ca1ca

Possible Values for K1

1
2
471f077e 471f87fe c79f077e c79f87fc ...
451f077c 451f87fc c59f077c c59f87fc

Possible Values for K2

1
2
8b722e41 8b72aec1 8b722e43 8b72aec3 ...
89722e43 8972aec3

Possible Values for K3

1
cfa38976 cfa309f6 4f238976 4f2309f6

Possible Values for K4

1
89eb0024 89eb0026 8beb0024 8beb0026

Possible Values for K5

1
b85e6bc0 b85e6bc2 ba5e6bc0 ba5e6bc2

借助差分分析,我们最终能获得一批子密钥候选集。与现代分组密码(AES 等)相比,FEAL-4 的轮数较少、结构也更为简洁,因此在教学和研究中非常适合用于演示差分分析思路。


总结

本文详细展示了对 FEAL-4 进行差分攻击的完整流程,包括从差分路径设计、生成明文对、中间子密钥 () 攻击,再到最后对 的逆向求解。同时也介绍了代码实现过程中的几处优化和验证方法。
FEAL-4 算法因为较低的轮数和简单的结构为差分分析提供了便利。然而对于现代更安全、更复杂的分组算法,同样的分析思路也至关重要。本案例希望对密码学的学习者和研究者有所启发,如有任何问题或改进方案,欢迎在评论区讨论交流!

This article primarily introduces how to perform differential cryptanalysis on the traditional block cipher algorithm FEAL-4, and demonstrates the attack process and key implementation details. The text also includes some experimental data and attack results for interested readers to reference or reproduce.


Two Important Images

Before delving into the analysis, here are two images closely related to FEAL-4 to provide an intuitive understanding of differential propagation and the overall encryption process.

  1. Differential Propagation Diagram
    The figure below shows how plaintext differences (in XOR form) propagate through the encryption rounds in FEAL-4 and ultimately reflect in the ciphertext.

  2. FEAL-4 Workflow Diagram
    The following figure illustrates the overall encryption framework of FEAL-4, including the left and right blocks of the plaintext, the injection of round subkeys, and the usage flow of the F function.


Characteristics of the F Function

The round function of FEAL-4 (i.e., the F function) can be viewed as some form of random permutation on a 32-bit input. In differential analysis, we do not need the specific implementation of F, but rather focus on its two key differential properties:

  • If , then .
  • If , then .

With these two conclusions, we can infer the output behavior under specific differential inputs in subsequent analyses, thereby aiding in locating the subkeys.


Detailed Analysis of Differential Paths

Suppose we choose a pair of special plaintexts and such that:

This specific differential vector helps simplify the differential computations in several rounds of FEAL-4 outputs.

1. Derivation of Initial Round Differences

  • Let denote the left and right blocks of (similarly, denotes those of ).

  • After the XOR operation in the first round:

    Similarly, the differential of the right half can be derived.

  • Furthermore, due to the “additional XOR round” during encryption, we obtain:

    This implies that in subsequent computations, the differences can be merged in a simpler form.

2. Backward Differential Calculation

After obtaining the plaintext pair and the corresponding ciphertext pair , further backward derivations can be made. Let the ciphertext be denoted as , then:

  • Compute , obtaining .
  • Based on the known differential properties, gradually restore intermediate variables such as , , , etc., in the last round.
  • Using these results, attacks or verifications on the subkeys can be performed.

Attacking K3

In differential analysis, one often starts with attacking an intermediate round subkey, such as . The specific steps are as follows:

  1. Select Plaintext Pairs: I generated 12 pairs of random plaintexts satisfying
  2. Encrypt and Extract Differences: Encrypt the above plaintext pairs to obtain the corresponding ciphertext pairs; combine with the differential path to derive intermediate values , , , etc.
  3. Traverse :

    Any candidate that satisfies the differential equation is retained.

Using this method, I successfully obtained four candidate subkeys:

1
cfa38976, cfa309f6, 4f238976, 4f2309f6

An example command to execute the above process is:

1
./main -mode=attackk3 -file=K3.txt

This implementation is consistent with the method I introduced on www.theamazingking.com.


Attacking K2

After obtaining several candidates for , we can proceed to deduce . The core equation is:

where comes from the previous round operations (including further computations on , , etc.). Similarly:

  1. Generate plaintext pairs that satisfy the differential condition
  2. Capture ciphertext differences after encryption, and by traversing all possible values, check if the differential constraint is satisfied.
  3. Example command:
    1
    ./main -mode=attackk2 -file=K2.txt -k3=cfa38976,cfa309f6,4f238976,4f2309f6
  4. The final output example is as follows:
    1
    2
    3
    4
    5
    Candidate K2: 8b722e41 (K3: cfa38976)
    Candidate K2: 8b72aec1 (K3: cfa38976)
    ...
    Candidate K2: 89722e43 (K3: 4f2309f6)
    Candidate K2: 8972aec3 (K3: 4f2309f6)

Attacking K1

Similarly, attacking can be done following the previous process.
To ensure the differential path consistently presents the expected results, I generated 12 plaintext pairs satisfying

and verified their encryption outputs.

Example command:

1
./main -mode=attackk1 -file=K1.txt -k3k2="cfa38976,8b722e41; cfa38976,8b72aec1; ..."

This allows you to input all previously obtained candidate combinations of and at once, and finally find the corresponding candidates.


Attacking K0, K4, K5

Once , , and are all obtained, , , and required for the last round can be deduced.
Let the final encrypted left and right parts be , . The corresponding plaintext blocks are , . Then:

By enumerating all possible , the corresponding and can be solved through the above equations. Then, use other plaintext-ciphertext pairs to cross-validate correctness. If all are satisfied, the correct final key set is obtained.

Example command:

1
./main -mode=attackk0k4k5 -k3k2k1="..." -file=K1.txt -file2=K1_p.txt

Implementation, Optimization, and Verification

1. Automated Generation and Parallel Computing

  • Automatically Generate Random Plaintext Pairs: Control the XOR differences in the code in advance to batch output plaintext files that meet the requirements (e.g., K3_p.txt, K2_p.txt, K1_p.txt).
  • Parallelized Search: Use Go’s Goroutines to start 10 parallel tasks, maximizing CPU resource utilization on a 10-core Mac.
  • Progress Bars and Visualization: Use the progressbar library to display search progress in real-time, making the experimental process more intuitive.

2. Verification Methods

  • Local Custom Key Testing
    First, select a set of known (e.g., 0x00000001, 0x00000002, 0x00000004, 0x00000008, 0x00000010, 0x00000020), encrypt the previously generated plaintext pairs, and repeat the differential analysis process to see if this set of keys can be matched among the results.
    Experiments confirmed the ability to correctly restore this set of test keys, thereby verifying the accuracy of the code logic.

  • Specific Plaintext-Ciphertext Pair Testing
    Select a plaintext 1234567890abcdef and the ciphertext f43ae3eeb56e2bbf generated in the Einstein Zone, and verify that each of the 256 candidate sets can reproduce this encryption mapping, further confirming the attack process is correct.


Final Results

All deduced are stored in final_result.txt. After deduplication, they include the following ranges:

Possible Values for K0

1
2
890c2148 890ca1c8 098c2148 098ca1c8 ...
0b8c214a 0b8ca1ca 8b0c214a 8b0ca1ca

Possible Values for K1

1
2
471f077e 471f87fe c79f077e c79f87fc ...
451f077c 451f87fc c59f077c c59f87fc

Possible Values for K2

1
2
8b722e41 8b72aec1 8b722e43 8b72aec3 ...
89722e43 8972aec3

Possible Values for K3

1
cfa38976 cfa309f6 4f238976 4f2309f6

Possible Values for K4

1
89eb0024 89eb0026 8beb0024 8beb0026

Possible Values for K5

1
b85e6bc0 b85e6bc2 ba5e6bc0 ba5e6bc2

With the aid of differential analysis, we ultimately obtained a set of candidate subkeys. Compared to modern block ciphers (such as AES), FEAL-4 has fewer rounds and a simpler structure, making it very suitable for demonstrating the concept of differential analysis in teaching and research.


Conclusion

This article detailed the complete process of performing a differential attack on FEAL-4, including designing differential paths, generating plaintext pairs, attacking intermediate subkeys (, , ), and finally reverse-solving for , , . It also introduced several optimizations and verification methods during the code implementation process.
The FEAL-4 algorithm, due to its fewer rounds and simpler structure, provides convenience for differential analysis. However, for more secure and complex modern block algorithms, the same analytical approach remains crucial. This case study aims to inspire learners and researchers in cryptography. If you have any questions or improvement suggestions, feel free to discuss and exchange ideas in the comments section!

For DCU’s Secure Programming course, the disassembly problems have a certain pattern. Using a fixed approach to solving them can help achieve quick results.

Prerequisite Skills

  1. Familiarity with assembly commands Assembly Instructions
  2. Understanding % and $ for registers and immediate values $ and % Registers and Immediate Values
  3. Knowledge of direct and indirect addressing Direct and Indirect Addressing
  4. Familiarity with one example C Code to Assembly Example

Approach to Solving

  1. Identify the number of parameters
  2. Identify the number of local variables
  3. Recognize the loop body
  4. Analyze remaining code snippets
  5. Identify the return value

Identify the Number of Parameters

The position of ebp is the saved frame pointer, and ebp+4 is the return address. Since the problems typically assume all parameters are of type int or int*, ebp+8, ebp+c, and ebp+10 correspond to the first, second, and third parameters, respectively.

By quickly scanning the code for occurrences of 0x__(%ebp) and identifying the largest offset, the number of parameters can be determined as (offset - 4) // 4.

For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
push %ebp                   <foo+0>
mov %esp, %ebp <foo+1>
sub $0x4, %esp <foo+3>
mov 0x8(%ebp), %eax <foo+6>
mov %eax, -0x4(%ebp) <foo+9>
mov -0x4(%ebp), %eax <foo+12>
cmp 0x10(%ebp), %eax <foo+15>
jge <foo+32> <foo+18>
mov 0xc(%ebp), %eax <foo+20>
incl (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
incl (%eax) <foo+28>
jmp <foo+12> <foo+30>
mov $0x0, %eax <foo+32>
leave <foo+37>
ret <foo+38>

Here, 0x10(%ebp) exists, so the parameter count is (16 - 4) / 4 = 3.

We can construct the framework of the code as:

1
2
3
int foo(int a, int b, int c) {

}

a, b, and c correspond to ebp+8, ebp+c, and ebp+10, respectively. Note that parameters are pushed onto the stack in reverse order, so the closer to ebp, the earlier the parameter appears in the list.

For now, assume all are int types. Adjust later if inconsistencies are found.

Identify the Number of Local Variables

The number of local variables is determined by the third line of the code: sub $0x4, %esp. The amount subtracted corresponds to the length of the allocated local variables.

In this example, sub $0x4, %esp indicates 4 bytes, so there is one local variable. Assume it is an int and name it i.

The code expands to:

1
2
3
int foo(int a, int b, int c) {
int i;
}

Recognize the Loop Body

Loops are typically while or for loops. To identify:

  1. Judgment Entry:

    • Look for a comparison instruction (e.g., cmp) followed by a jump instruction (e.g., jge or jle).
    • These indicate the start of a condition check.
  2. Loop Body:

    • Unconditional jmp instructions signify loops. The jump target is the beginning of the condition check.
  3. Condition:

    • The judgment condition combines the comparison and preceding instructions into a complete condition.

Example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
push %ebp                   <foo+0>
mov %esp, %ebp <foo+1>
sub $0x4, %esp <foo+3>
mov 0x8(%ebp), %eax <foo+6>
mov %eax, -0x4(%ebp) <foo+9>
mov -0x4(%ebp), %eax <foo+12>
cmp 0x10(%ebp), %eax <foo+15>
jge <foo+32> <foo+18>
mov 0xc(%ebp), %eax <foo+20>
incl (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
incl (%eax) <foo+28>
jmp <foo+12> <foo+30>
mov $0x0, %eax <foo+32>
leave <foo+37>
ret <foo+38>

Judgment Entry:

The combination of cmp and jge indicates a judgment entry.

Loop:

The jmp command jumps to <foo+12>, signifying the loop condition check.

Condition:

  • mov -0x4(%ebp), %eax: Assigns the value of i to eax.
  • cmp 0x10(%ebp), %eax: Compares eax (value of i) with c.

This calculates i - c and checks the condition with jge. In assembly, conditions are reversed compared to C: jge skips the loop if the condition is met. Thus, the C condition is i - c < 0.

The code updates to:

1
2
3
4
5
6
int foo(int a, int b, int c) {
int i;
while (i - c < 0) {

}
}

Analyze Remaining Code Snippets

Before the Loop:

1
2
mov 0x8(%ebp), %eax         <foo+6>
mov %eax, -0x4(%ebp) <foo+9>

These lines assign the value of a to i:

1
i = a;

Inside the Loop:

1
2
3
4
mov 0xc(%ebp), %eax         <foo+20>
incl (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
incl (%eax) <foo+28>
  • mov 0xc(%ebp), %eax and incl (%eax) increment the value at the address stored in b:
1
(*b)++;
  • lea -0x4(%ebp), %eax and incl (%eax) increment i:
1
i++;

The updated code becomes:

1
2
3
4
5
6
7
8
int foo(int a, int *b, int c) {
int i;
i = a;
while (i - c < 0) {
(*b)++;
i++;
}
}

Identify the Return Value

In x86 calling conventions, return values are stored in the eax register.

1
mov $0x0, %eax

This indicates the function returns 0:

1
return 0;

Final Code

1
2
3
4
5
6
7
8
9
int foo(int a, int *b, int c) {
int i;
i = a;
while (i - c < 0) {
(*b)++;
i++;
}
return 0;
}

对于 DCU 安全编程这门课的反汇编,由于题目有一定套路,使用固定的解题思路,可以快速解题

前置技能

  1. 熟悉各种汇编命令 汇编指令
  2. 熟悉 % 和 $ 表示寄存器和立即数 $ 与 % 寄存器与立即数
  3. 熟悉直接寻址和间接寻址 直接寻址与间接寻址
  4. 了解一个case C语言代码到汇编的例子

解题思路

  1. 找到入参个数
  2. 找到局部变量个数
  3. 识别循环体
  4. 解析剩余代码片段
  5. 识别返回数据

找到入参个数

ebp的位置是saved frame ptr,ebp+4的位置是return address。由于题目往往是约定入参全部为int 或者 int * 类型,因此 ebp+8,ebp+c,ebp+10的位置,分别是入参的第一个参数,第二个参数,第三个参数。

因此,在代码中快速浏览,寻找 0x__(%ebp) 的字样,找到最大的偏移量。(偏移量-4)//4 的值就是入参的个数。

例如,下面这段代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
push %ebp                   <foo+0>
mov %esp, %ebp <foo+1>
sub $0x4, %esp <foo+3>
mov 0x8(%ebp), %eax <foo+6>
mov %eax, -0x4(%ebp) <foo+9>
mov -0x4(%ebp), %eax <foo+12>
cmp 0x10(%ebp), %eax <foo+15>
jge <foo+32> <foo+18>
mov 0xc(%ebp), %eax <foo+20>
incl (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
incl (%eax) <foo+28>
jmp <foo+12> <foo+30>
mov $0x0, %eax <foo+32>
leave <foo+37>
ret <foo+38>

我们发现存在 0x10(%ebp) 因此,参数个数为(16-4)/4 ,也就是是3个。

因此我们可以写出代码的框架,如下

1
2
3
int foo(int a, int b, int c) {

}

其中 a, b, c 存在的栈的位置分别是 ebp+8,ebp+c,ebp+10

注意,参数从后向前入栈,因此越接近ebp,参数在参数列表中越靠前。

注意,我们这里先假设全部都是int类型,如果后续有遇到不一致的情况,我们再来修改

找到局部变量个数

局部变量的个数在代码第三行,sub $0x4, %esp 这里减去的数量,就是分配的局部变量的长度。

在上面的代码中,显然这里分配了4字节,也就是只有一个局部变量。我们假设这个局部变量就是int类型,起名为i。

我们继续扩展我们的代码

1
2
3
int foo(int a, int b, int c) {
int i;
}

识别循环体

循环体通常是while循环或者for循环。

对于常见的while命令,我们寻找下面相关的指令:

  • 判断入口:

判断入口是一个连续的比较命令(如cmp)和一个跳转命令(如jge或jle)组成。找到判断入口有助于我们识别代码中是否有判断或者循环。

  • 循环:

假如没有循环标志,那么就不是循环体,而是简单的if判断。循环是一个无条件的jmp命令。循环所跳转的位置就是判断条件的开始。

  • 判断条件:

判断条件是判断入口以及判断入口和前面几条命令组合成的完整判断条件。判断条件就是while括号里的内容。

现在我们查看下面的例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
push %ebp                   <foo+0>
mov %esp, %ebp <foo+1>
sub $0x4, %esp <foo+3>
mov 0x8(%ebp), %eax <foo+6>
mov %eax, -0x4(%ebp) <foo+9>
mov -0x4(%ebp), %eax <foo+12>
cmp 0x10(%ebp), %eax <foo+15>
jge <foo+32> <foo+18>
mov 0xc(%ebp), %eax <foo+20>
imul (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
imul (%eax) <foo+28>
jmp <foo+12> <foo+30>
mov $0x0, %eax <foo+32>
leave <foo+37>
ret <foo+38>

判断入口:

我们寻找一个连续的判断指令和一个跳转指令,我们可以找到 cmp 和 jge 这两个连续指令,就说明这里是判断入口。

循环:

我们看到,最后存在一个jmp命令,jmp命令指向了第foo+12的位置,因此foo+12就是循环的判断条件。

判断条件:

判断条件的第一句是 mov -0x4(%ebp), %eax ,其中 -0x4(%ebp) 是最后一个局部变量,由于我们只有一个局部变量 i ,因此这句话的意思是,将本地变量i的值赋值给 eax。

我们看到cmp的命令是:cmp 0x10(%ebp), %eax ,其中cmp 0x10(%ebp) 指第三个入参的值。这句话的意思是计算 eax 寄存器中减去第一个入参的数值,也就是c。

联合起来,我们就知道,这里是计算 i - c的值,并交给 jge 来做跳转。

由于汇编的条件判断和C语言条件判断是反过来的,汇编是满足判断则跳过循环体,而C语言是满足条件则执行循环体,因此我们的判断条件也是反过来的。

jge表示大于等于0则跳过循环体,因此C语言的循环条件是小于0则执行循环体,也就是判断 i - c 是否小于 0

因此我们可以继续完善我们的代码:

1
2
3
4
5
6
int foo(int a, int b, int c) {
int i;
while (i - c < 0) {

}
}

解析剩余代码片段

首先标记下已经处理过的代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
push %ebp                   <foo+0> 无用
mov %esp, %ebp <foo+1> 无用
sub $0x4, %esp <foo+3> 分配了几个局部变量
mov 0x8(%ebp), %eax <foo+6>
mov %eax, -0x4(%ebp) <foo+9>
mov -0x4(%ebp), %eax <foo+12> 判断条件开始
cmp 0x10(%ebp), %eax <foo+15> 判断内容
jge <foo+32> <foo+18> 判断条件结束
mov 0xc(%ebp), %eax <foo+20>
incl (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
incl (%eax) <foo+28>
jmp <foo+12> <foo+30> 循环结束
mov $0x0, %eax <foo+32>
leave <foo+37>
ret <foo+38>
  1. 循环体之前的代码

我们看一下循环开始之前的部分:

1
2
mov 0x8(%ebp), %eax         <foo+6>
mov %eax, -0x4(%ebp) <foo+9>

我们发现这两句都涉及到了 eax 寄存器

对于多条涉及eax 寄存器的操作,我们尽量当作一个整体来看

我们发现,其实这里就是将 0x8(%ebp) 的数值 赋值给了 -0x4(%ebp) ,这两个值分别是 a 和 i

因此这个部分的代码是

1
i = a
  1. 循环体中间的代码
1
2
3
4
mov 0xc(%ebp), %eax         <foo+20>
incl (%eax) <foo+23>
lea -0x4(%ebp), %eax <foo+25>
incl (%eax) <foo+28>

对于涉及eax的多条指令,我们不要一条一条看,而是看作一个整体。

其中,mov 0xc(%ebp), %eaxincl (%eax) 可以看作 incl (%b) 注意,由于这里是间接寻址,因此,这意味着,b中存的是地址。因此我们需要修改我们的入参,b这里不是数值,而是地址。这句代码意味着

1
(*b)++ 

注意

*b++ 会先对 *b 解引用,再对 b 自增

因此这里括号不能省略,否则会有问题

而对于 lea -0x4(%ebp), %eaxincl (%eax) ,可以看作 incl (&i) ,这里继续是间接寻址,也就是对后者地址上的数值自增。这句代码意味着 *(&i)++ ,可以简化为

1
i++ 

之所以这么复杂,是因为汇编指令不能直接对地址进行自增操作,但是可以对一个包含地址的寄存器做自增操作。

因此,补充了剩余的代码片段,代码应该为

1
2
3
4
5
6
7
8
int foo(int a, int *b, int c) {
int i;
i = a;
while (i - c < 0) {
(*b)++;
i++;
}
}

识别返回数据

按照C语言的x86的调用约定,返回数据放在eax中。

1
mov $0x0, %eax

很显然,这里直接返回了0

最终代码

1
2
3
4
5
6
7
8
9
int foo(int a, int *b, int c) {
int i;
i = a;
while (i - c < 0) {
(*b)++;
i++;
}
return 0;
}

In a previous article, I used Obsidian’s QuickAdd to create a script that automatically converts text copied from ChatGPT and fixes the LaTeX formatting. However, there is no suitable plugin available for the Craft app.

We can use Raycast to achieve this functionality uniformly.


Create a Raycast Script

First, we need to create a script.

image

Next, select the Bash template.

Then, we edit the Bash script and enter the following code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#!/bin/bash

# Required parameters:
# @raycast.schemaVersion 1
# @raycast.title Copy From ChatGPT
# @raycast.mode silent

# Optional parameters:
# @raycast.icon 🤖
# @raycast.packageName ChatGPT Utils

# Documentation:
# @raycast.description Copy From ChatGPT
# @raycast.author Nansen Li
# @raycast.authorURL nansenli.com

# Get clipboard content
clipboard_content=$(pbpaste)

# Check if content was successfully retrieved
if [ -z "$clipboard_content" ]; then
echo "Clipboard is empty or inaccessible."
exit 1
fi

# Process clipboard content
modified_content=$(echo "$clipboard_content" | \
sed 's/\\\[/$$/g; s/\\\]/$$/g; s/\\( /$/g; s/ \\\)/$/g')

# Write the modified content back to the clipboard
echo "$modified_content" | pbcopy

After creating the script, we need to add the directory containing the script to Raycast.

image

In this step, select the directory where the script was just created. At this point, we can see the newly created script in Script Commands.


How to Use

After copying a formula from ChatGPT, open the Raycast panel, find the newly created script, and run it. The clipboard content will be automatically fixed. Then, simply paste it into Obsidian or Craft.

在之前的文章中,我使用了 Obsidian 的 QuickAdd 来创建一个脚本,自动转换从 ChatGPT 中复制的文本,修复其中的 LaTeX 格式。然而,对于 Craft 这款应用,并没有合适的插件可以使用。

我们可以通过 Raycast 来实现这个功能的统一操作。


创建 Raycast 脚本

首先,我们需要创建一个脚本。

image

接着选择 Bash 模板。

然后,我们编辑这个 Bash 脚本,输入如下代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#!/bin/bash

# Required parameters:
# @raycast.schemaVersion 1
# @raycast.title Copy From ChatGPT
# @raycast.mode silent

# Optional parameters:
# @raycast.icon 🤖
# @raycast.packageName ChatGPT Utils

# Documentation:
# @raycast.description Copy From ChatGPT
# @raycast.author Nansen Li
# @raycast.authorURL nansenli.com

# 获取剪贴板内容
clipboard_content=$(pbpaste)

# 检查是否成功获取内容
if [ -z "$clipboard_content" ]; then
echo "剪贴板为空或无法访问。"
exit 1
fi

# 处理剪贴板内容
modified_content=$(echo "$clipboard_content" | \
sed 's/\\\[/$$/g; s/\\\]/$$/g; s/\\( /$/g; s/ \\\)/$/g')

# 将修改后的内容写回剪贴板
echo "$modified_content" | pbcopy

创建完脚本后,我们还需要将脚本所在的目录添加到 Raycast 中。

image

在这一步中,选择刚刚创建的脚本目录。此时,我们可以在 Script Commands 中看到刚刚创建的脚本。


如何使用

在复制完 ChatGPT 的公式后,打开 Raycast 的面板,找到刚刚的脚本并运行,此时剪贴板中的内容就会被自动修复。接下来,只需将其粘贴到 Obsidian 或 Craft 中即可。

Background

I’m Nansen, and I participated in the 2024 Huawei Ireland Research Center Server Cluster Management Optimization Competition. Here, I’d like to share my experience in this competition and summarize some key takeaways.

Our algorithm code can be found here: huawei2024

Competition Results

We achieved first place in the algorithm section, scoring approximately 4%-5% higher than the second to fourth places, giving us a significant advantage. However, we faced considerable challenges in the presentation segment. First, we recognized that there is room for improvement in our English fluency. Second, we found that our presentation slides could be more polished and visually appealing. Lastly, we encountered some challenges with time management. Nevertheless, despite these obstacles, we still managed to secure third place overall.

Competition Process

The competition was divided into two stages. The first stage allowed ample preparation time. Once we settled on using the simulated annealing algorithm, we began developing it. The main difficulty in this stage was optimizing and understanding the requirements of the task. During development, we encountered numerous bugs, but after fixing them, our score improved significantly.

In the second stage, as the problem was released on the day of the competition, I continued optimizing the algorithm from the first stage, successfully increasing the evaluation speed by 1000 times. This significantly boosted our performance in the second stage, providing us with enough strength to vie for first place.

In the final round, our algorithm performed very consistently, and after some adjustments, we took a considerable lead over our competitors. However, because we didn’t put enough emphasis on making an effective presentation, we mistakenly believed that high algorithm performance alone would guarantee a top score, which proved to be wrong.

Lessons Learned

Algorithm Choice

Fortunately, I chose the right algorithm from the outset, and shortly after the problem was released, I devised a framework that suited the entire competition. However, I did take some wrong turns, such as attempting impractical algorithms like PPO. After initial trials failed, I should have moved on instead of wasting further effort. Given the limited time, we should focus on achievable optimal results within the shortest period rather than pursuing ideal but unrealistic solutions. It’s also crucial to recognize one’s limitations and concentrate on goals that are achievable in the available time.

Team Collaboration

Luckily, our team division was reasonable this time, and I did my best to ensure every member could contribute their value. One area for improvement is communicating more with team members to understand their ideas and preferences. Since I mainly handled the algorithm part, I had relatively little interaction with teammates, which I will work to improve next time.

Presentation Design

We didn’t anticipate that the level of presentation skills from other teams would be so high. My teammates speculated that some competitors might have a business background, giving them an advantage in crafting presentations. Moreover, they had five members in their team while we only had three, which put us at a disadvantage regarding manpower. These were objective challenges, but if we had paid more attention to creating our presentation, the first prize could have been within our reach.

Over-committing Leading to Imbalance

In the final round, our algorithm was already quite excellent, and our score surpassed the previously top-ranked team. However, I continued spending considerable time on further optimizations. Even though we were significantly ahead, this focus caused us to neglect the preparation of our presentation. In hindsight, I should have known when to stop and fully recognized the importance of balancing different aspects of the scoring criteria.

Conclusion

Participating in the Huawei Tech Arena 2024 competition provided me with invaluable experience. The competition highlighted our strengths, but also revealed areas where we need to improve in terms of showcasing skills and team collaboration. Looking ahead, I will keep these lessons in mind and strive to continuously improve myself in future competitions. If you have any questions, feel free to leave them in the comments section.

背景介绍

我是Nansen,参与了2024年华为爱尔兰研究中心的服务器集群管理优化比赛。在此,我想分享一下这次比赛的经历,并对其中的关键点进行总结。

我们的算法代码部分可以参考这里: huawei2024

比赛结果

我们在算法部分获得了第一名,分数比第二至第四名高出约4%-5%,取得了巨大的优势。然而,在演讲环节我们遇到了很大的挑战。首先,我们意识到英语表达方面有提升空间;其次,我们发现演示PPT还可以更精美突出;最后,在时间安排上也存在一些挑战。不过,尽管如此,我们还是在总分上取得了第三名的成绩。

比赛过程

比赛分为两个阶段。第一阶段有较长的准备时间。当我们确定使用模拟退火算法后,就开始了算法的开发。第一阶段的难点主要在于优化和理解题目需求。在开发过程中,我们遇到了许多bug,但在修复后,分数有了明显提升。

第二阶段,由于题目在比赛当天才发布,我继续优化第一阶段的算法,成功将评估算法的运行速度提升了1000倍。这显著提高了我们在第二阶段的表现,让我们有足够的实力争夺第一名。

在决赛阶段,我们的算法表现非常稳定,并在调整后大幅领先对手。然而,由于我们没有足够重视PPT的制作,我们以为只要算法表现好,排名靠前,就能获得高分,但事实证明我们错了。

经验和教训

算法选择

很幸运,在最初的阶段我就选对了最终的算法,并且在题目发布后不久就构思出了适用于整个比赛的算法框架。然而,我也走了一些弯路,比如尝试了一些不可行的算法(如PPO算法)。在初步尝试无果后,我应该及时停止,而不是继续浪费精力。由于时间有限,我们应追求最短时间内的最优效果,而不是追求一个不切实际、理想中的方案。同时,也要认清自己的能力边界,专注于在短时间内能够实现的目标。

团队分工

很幸运,这次团队的分工是合理的,我尽力确保每个成员都能发挥自己的价值。改进之处在于,应该更多地与队员沟通,了解他们的意愿和想法。由于我主要负责算法部分,与队友的沟通相对较少,下一次我会在这方面做得更好。

PPT的制作

我们没有预料到,其他参赛队伍的PPT水平如此之高。我的队友猜测,他们可能有商科背景,这使得他们在制作PPT时具有优势。此外,他们的团队有五名成员,而我们只有三名,这也让我们在人员配备上处于劣势。这些都是客观上的挑战,但如果我们更加重视PPT的制作,或许第一名就是我们的。

过分投入导致失衡

在决赛阶段,其实我们的算法已经非常出色,分数也超过了此前排名第一的队伍。然而,我仍然花了大量时间继续优化算法,尽管我们的算法分数领先对手很多,但却因此忽略了PPT的准备。事实上,我应该适可而止,并充分理解评分标准的重要性。

结论

通过参加华为Tech Arena 2024竞赛,我获得了宝贵的经验。这次比赛不仅展示了我们的优势,也暴露了我们在展示技能和团队协作方面需要改进的地方。展望未来,我会牢记这些经验教训,并在今后的比赛中不断改进自己。如果您有问题,可以在留言区反馈给我。

0%