2. Getting started
You’ll need Python 3.10 or higher and your favorite text editor. We won’t need third-party packages, virtualenvs, or anything beyond a regular Python interpreter: everything we need is in Python’s standard library.
We’ll split the code into two files:
- An executable, called wyag;
- A Python library, called libwyag.py.
Now, every software project starts with a boatload of boilerplate, so let’s get this over with.
We’ll begin by creating the (very short) executable. Create a new file called wyag in your text editor, and copy the following few lines:
#!/usr/bin/env python3
import libwyag
libwyag.main()
Then make it executable:
$ chmod +x wyag
You’re done!
Now for the library. It must be called libwyag.py and be in the same directory as the wyag executable. Begin by opening the empty libwyag.py in your text editor.
We’re first going to need a bunch of imports (just copy each import, or merge them all into a single line):
Git is a CLI application, so we’ll need something to parse command-line arguments. Python provides a cool module called argparse that can do 99% of the job for us.
import argparse
We’ll need a few more container types than the base lib provides, most notably an OrderedDict. It’s in collections.
import collections
Git uses a configuration file format that is basically Microsoft’s INI format. The configparser module can read and write these files.
import configparser
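To get a feel for the format, here is a minimal sketch that parses a small INI-style configuration from a string. The sample text is only an assumption for illustration, loosely modeled on what a repository configuration might contain; it is not taken from a real file.

```python
import configparser

# Illustrative INI-style text (an assumption, not a real Git config file).
SAMPLE = """
[core]
repositoryformatversion = 0
filemode = false
bare = false
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# Values are stored as strings; the typed getters convert them.
version = parser.getint("core", "repositoryformatversion")
bare = parser.getboolean("core", "bare")
print(version, bare)
```

The typed getters (getint, getboolean) save us from converting strings by hand every time we read a setting.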
We’ll be doing some date/time manipulation:
from datetime import datetime
We’ll need, just once, to read the users/group database on Unix (grp is for groups, pwd for users). This is because Git saves the numerical owner/group ID of files, and we’ll want to display that nicely (as text):
import grp, pwd
To support .gitignore, we’ll need to match filenames against patterns like *.txt. Filename matching is in… fnmatch:
from fnmatch import fnmatch
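As a quick illustration of shell-style matching, the sketch below filters a list of made-up file names against the *.txt pattern. The file names are hypothetical and only serve the example.

```python
from fnmatch import fnmatch

# Hypothetical file names, checked against a shell-style pattern.
names = ["notes.txt", "main.py", "README"]
matches = [name for name in names if fnmatch(name, "*.txt")]
print(matches)
```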
Git uses the SHA-1 function quite extensively. In Python, it’s in hashlib.
import hashlib
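For reference, this is roughly what computing a SHA-1 digest looks like with hashlib. The input bytes are arbitrary sample data, and the sketch hashes them directly, without any of the framing a real Git object would have.

```python
import hashlib

# Arbitrary sample bytes; hashlib works on bytes, not on str.
data = b"hello world"
digest = hashlib.sha1(data).hexdigest()

# A SHA-1 digest is 160 bits, i.e. 40 hexadecimal characters.
print(digest, len(digest))
```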
Just one function from math:
from math import ceil
os and os.path provide some nice filesystem abstraction routines.
import os
We use just a bit of regular expressions:
import re
We also need sys to access the actual command-line arguments (in sys.argv):
import sys
Git compresses everything using zlib. Python has that, too:
import zlib
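A minimal round trip through zlib might look like the sketch below. The payload is arbitrary repetitive data chosen so that compression has something to work with.

```python
import zlib

# Arbitrary, highly repetitive sample data (compresses well).
original = b"wyag " * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed), restored == original)
```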
Imports are done. We’ll be working with command-line arguments a lot. Python provides a simple yet reasonably powerful parsing library, argparse. It’s a nice library, but its interface may not be the most intuitive ever; if needed, refer to its documentation.
argparser = argparse.ArgumentParser(description="The stupidest content tracker")
We’ll need to handle subcommands (as in git: init, commit, etc.). In argparse slang, these are called “subparsers”. At this point we only need to declare that our CLI will use some, and that every invocation will actually require one: you don’t just call git, you call git COMMAND.
argsubparsers = argparser.add_subparsers(title="Commands", dest="command")
argsubparsers.required = True
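To see what the required flag changes in practice, here is a small standalone sketch. The parser name demo and the single init subcommand with its optional path argument are assumptions made for the example, not the definitions used later in this guide.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo", description="Subcommand demo")
subparsers = parser.add_subparsers(title="Commands", dest="command")
subparsers.required = True

# One illustrative subcommand with an optional positional argument.
init_parser = subparsers.add_parser("init", help="Initialize something")
init_parser.add_argument("path", nargs="?", default=".", help="Where to initialize")

# The chosen subcommand's name ends up in args.command.
args = parser.parse_args(["init", "my-repo"])
print(args.command, args.path)

# Without a subcommand, argparse prints an error and exits with status 2.
try:
    parser.parse_args([])
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
print(exit_code)
```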
The dest="command" argument states that the name of the chosen subparser will be returned as a string in a field called command. So we just need to read this string and call the correct function accordingly. By convention, I’ll call these functions “bridge functions” and prefix their names with cmd_. Bridge functions take the parsed arguments as their only parameter, and are responsible for processing and validating them before executing the actual command.
def main(argv=sys.argv[1:]):
    args = argparser.parse_args(argv)
    # Dispatch to the bridge function that matches the chosen subcommand.
    match args.command:
        case "add"          : cmd_add(args)
        case "cat-file"     : cmd_cat_file(args)
        case "check-ignore" : cmd_check_ignore(args)
        case "checkout"     : cmd_checkout(args)
        case "commit"       : cmd_commit(args)
        case "hash-object"  : cmd_hash_object(args)
        case "init"         : cmd_init(args)
        case "log"          : cmd_log(args)
        case "ls-files"     : cmd_ls_files(args)
        case "ls-tree"      : cmd_ls_tree(args)
        case "rev-parse"    : cmd_rev_parse(args)
        case "rm"           : cmd_rm(args)
        case "show-ref"     : cmd_show_ref(args)
        case "status"       : cmd_status(args)
        case "tag"          : cmd_tag(args)
        case _              : print("Bad command.")