2. Getting started

You’re going to need Python 3.10 or higher, along with your favorite text editor. We won’t need third-party packages or virtualenvs, or anything besides a regular Python interpreter: everything we need is in Python’s standard library.
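
If you are unsure which interpreter you have, a small check like the one below can tell you. This is only a sketch: the names MINIMUM and is_supported are made up for illustration and are not part of wyag.

```python
import sys

MINIMUM = (3, 10)  # the minimum version this guide asks for

def is_supported(version=sys.version_info):
    # Tuples compare element by element, so (3, 9) < (3, 10).
    return tuple(version[:2]) >= MINIMUM

print(is_supported((3, 9, 7)))   # False
print(is_supported((3, 12, 0)))  # True
```

Calling is_supported() with no argument checks the interpreter that is running the script.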

We’ll split the code into two files:

  • An executable, called wyag;

  • A Python library, called libwyag.py;

Now, every software project starts with a boatload of boilerplate, so let’s get this over with.

We’ll begin by creating the (very short) executable. Create a new file called wyag in your text editor, and copy the following few lines:

python
#!/usr/bin/env python3

import libwyag
libwyag.main()

Then make it executable:

shell
$ chmod +x wyag

You’re done!

Now for the library. It must be called libwyag.py, and be in the same directory as the wyag executable. Begin by opening the empty libwyag.py in your text editor.

We’re first going to need a bunch of imports (just copy each import, or merge them all in a single line):

  • Git is a CLI application, so we’ll need something to parse command-line arguments. Python provides a cool module called argparse that can do 99% of the job for us.

    python
    import argparse
  • We’ll need a few more container types than the built-ins provide, most notably an OrderedDict. It’s in collections.

    python
    import collections
  • Git uses a configuration file format that is basically Microsoft’s INI format. The configparser module can read and write these files.

    python
    import configparser
  • We’ll be doing some date/time manipulation:

    python
    from datetime import datetime
  • We’ll need, just once, to read the user/group database on Unix (grp is for groups, pwd for users). This is because Git saves the numerical owner/group ID of files, and we’ll want to display that nicely (as text):

    python
    import grp, pwd
  • To support .gitignore, we’ll need to match filenames against patterns like *.txt. Filename matching is in… fnmatch:

    python
    from fnmatch import fnmatch
  • Git uses the SHA-1 function quite extensively. In Python, it’s in hashlib.

    python
    import hashlib
  • Just one function from math:

    python
    from math import ceil
  • os and os.path provide some nice filesystem abstraction routines.

    python
    import os
  • We use just a bit of regular expressions:

    python
    import re
  • We also need sys to access the actual command-line arguments (in sys.argv):

    python
    import sys
  • Git compresses everything using zlib. Python has that, too:

    python
    import zlib
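
To get a feel for a few of these modules before we use them for real, here is a minimal, standalone sketch. It is not part of libwyag.py: the INI snippet, the filenames and the payload are made-up values, and it only exercises calls from the standard library.

```python
import configparser
import hashlib
import zlib
from fnmatch import fnmatch

# configparser: parse an INI-style snippet from a string (contents are made up).
config = configparser.ConfigParser()
config.read_string("[core]\nbare = false\nfilemode = true\n")
is_bare = config.getboolean("core", "bare")

# fnmatch: test filenames against a shell-style pattern.
matches_txt = fnmatch("notes.txt", "*.txt")
matches_py = fnmatch("libwyag.py", "*.txt")

# hashlib: SHA-1 digest of some bytes, as a 40-character hex string.
payload = b"hello, wyag\n"
digest = hashlib.sha1(payload).hexdigest()

# zlib: compress, then decompress back to the original bytes.
compressed = zlib.compress(payload)
restored = zlib.decompress(compressed)

print(is_bare, matches_txt, matches_py, len(digest), restored == payload)
# False True False 40 True
```

Note that hashlib and zlib work on bytes, not str, which is why the payload is written as a bytes literal.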

Imports are done. We’ll be working with command-line arguments a lot. Python provides a simple yet reasonably powerful parsing library, argparse. It’s a nice library, but its interface may not be the most intuitive ever; if needed, refer to its documentation.

python
argparser = argparse.ArgumentParser(description="The stupidest content tracker")

We’ll need to handle subcommands (as in git: init, commit, etc.). In argparse slang, these are called “subparsers”. At this point we only need to declare that our CLI will use some, and that every invocation will actually require one: you don’t just call git, you call git COMMAND.

python
argsubparsers = argparser.add_subparsers(title="Commands", dest="command")
argsubparsers.required = True
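
To see what a subparser looks like once one is attached, here is a standalone sketch. It uses separate, made-up names (demo_parser, demo_subparsers) so it does not interfere with the argparser above, and the "init" subcommand with its optional path argument is only a hypothetical example.

```python
import argparse

# A fresh parser, kept separate from the real argparser.
demo_parser = argparse.ArgumentParser(description="demo")
demo_subparsers = demo_parser.add_subparsers(title="Commands", dest="command")
demo_subparsers.required = True

# Hypothetical "init" subcommand with one optional positional argument.
init_parser = demo_subparsers.add_parser("init", help="Create a repository.")
init_parser.add_argument("path", nargs="?", default=".", help="Where to create it.")

# The chosen subcommand ends up in args.command, its arguments next to it.
args = demo_parser.parse_args(["init", "my-repo"])
print(args.command, args.path)  # init my-repo

# Leaving out the optional argument falls back to its default.
defaulted = demo_parser.parse_args(["init"])
print(defaulted.path)  # .
```

Because required is set to True, parsing an empty argument list makes argparse print a usage error and exit instead of returning.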

The dest="command" argument states that the name of the chosen subparser will be returned as a string in a field called command. So we just need to read this string and call the correct function accordingly. By convention, I’ll call these functions “bridge functions” and prefix their names with cmd_. Bridge functions take the parsed arguments as their only parameter, and are responsible for processing and validating them before executing the actual command.

python
def main(argv=sys.argv[1:]):
    args = argparser.parse_args(argv)
    match args.command:
        case "add"          : cmd_add(args)
        case "cat-file"     : cmd_cat_file(args)
        case "check-ignore" : cmd_check_ignore(args)
        case "checkout"     : cmd_checkout(args)
        case "commit"       : cmd_commit(args)
        case "hash-object"  : cmd_hash_object(args)
        case "init"         : cmd_init(args)
        case "log"          : cmd_log(args)
        case "ls-files"     : cmd_ls_files(args)
        case "ls-tree"      : cmd_ls_tree(args)
        case "rev-parse"    : cmd_rev_parse(args)
        case "rm"           : cmd_rm(args)
        case "show-ref"     : cmd_show_ref(args)
        case "status"       : cmd_status(args)
        case "tag"          : cmd_tag(args)
        case _              : print("Bad command.")