Commit

Document Sync by Tina
Chivier committed Oct 21, 2024
1 parent 75e30eb commit effe441
Showing 2 changed files with 43 additions and 20 deletions.
docs/stable/getting_started/installation.md — 33 additions, 11 deletions
@@ -9,35 +9,57 @@ sidebar_position: 0
- Python: 3.10
- GPU: compute capability 7.0 or higher

## Installing with pip
```bash
# On the head node
conda create -n sllm python=3.10 -y
conda activate sllm
pip install serverless-llm
pip install serverless-llm-store

# On a worker node
conda create -n sllm-worker python=3.10 -y
conda activate sllm-worker
pip install serverless-llm[worker]
pip install serverless-llm-store
```
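
To sanity-check a pip installation on either node, you can try importing the store library. This is a minimal sketch; it assumes the `serverless-llm-store` package exposes the `sllm_store` module used in the store examples later in these docs:

```python
# Verify that the store library installed by pip is importable.
from sllm_store.transformers import load_model, save_model

print("sllm_store imports:", save_model.__name__, load_model.__name__)
```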

:::note
If you plan to use vLLM with ServerlessLLM, you need to apply our patch to the vLLM repository. Refer to the [vLLM Patch](#vllm-patch) section for more details.
:::


## Installing from source
To install the package from source, follow these steps:
```bash
git clone https://github.com/ServerlessLLM/ServerlessLLM
cd ServerlessLLM
```

```bash
# On the head node
conda create -n sllm python=3.10 -y
conda activate sllm
pip install -e .
cd sllm_store && rm -rf build
# Installing `sllm_store` from source can be slow. We recommend using pip install.
pip install .
# On a worker node
conda create -n sllm-worker python=3.10 -y
conda activate sllm-worker
pip install -e ".[worker]"
cd sllm_store && rm -rf build
# Installing `sllm_store` from source can be slow. We recommend using pip install.
pip install .
```

## vLLM Patch
To use vLLM with ServerlessLLM, you need to apply our patch located at `sllm_store/vllm_patch/sllm_load.patch` to the vLLM repository.
The patch has been tested with vLLM version `0.5.0.post1`.
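
Before applying the patch, it is worth confirming that the installed vLLM matches the tested version. A minimal check (nothing here is ServerlessLLM-specific):

```python
# Print the installed vLLM version; the patch is tested against 0.5.0.post1.
import vllm

print(vllm.__version__)
```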

You can apply the patch by running the following script:
```bash
conda activate sllm-worker
./sllm_store/vllm_patch/patch.sh
```
docs/stable/store/quickstart.md — 10 additions, 9 deletions
@@ -26,20 +26,21 @@ conda activate sllm-store

### Install with pip
```bash
pip install serverless-llm-store
```

### Install from source
1. Clone the repository and enter the `sllm_store` directory

```bash
git clone git@github.com:ServerlessLLM/ServerlessLLM.git
cd ServerlessLLM/sllm_store
```

2. Install the package from source

```bash
rm -rf build
pip install .
```
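
After building from source, you can confirm that Python picks up the locally built package by printing where it was imported from. A small sketch, with no version-specific assumptions:

```python
# Show which sllm_store installation is on the import path.
import sllm_store

print(sllm_store.__file__)
```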

@@ -55,7 +56,7 @@ ln -s /mnt/nvme/models ./models

1. Convert a model to ServerlessLLM format and save it to a local path:
```python
from sllm_store.transformers import save_model

# Load a model from HuggingFace model hub.
import torch
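from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b", torch_dtype=torch.float16
)

# The call below is a sketch: the model name, output path, and the exact
# save_model signature are assumptions based on this guide; adjust as needed.
save_model(model, "./models/facebook/opt-1.3b")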
```
@@ -84,7 +85,7 @@ docker run -it --rm -v $PWD/models:/app/models checkpoint_store_server
```python
import time
import torch
from sllm_store.transformers import load_model

# warm up the GPU
num_gpus = torch.cuda.device_count()
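for i in range(num_gpus):
    torch.ones(1).to(f"cuda:{i}")
    torch.cuda.synchronize(i)

# The rest of this example is a sketch: the load_model keyword arguments are
# assumptions based on this guide and may differ in your installed version.
start = time.time()
model = load_model(
    "facebook/opt-1.3b",
    device_map="auto",
    torch_dtype=torch.float16,
    storage_path="./models/",
)
print(f"Model loading time: {time.time() - start:.2f} seconds")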
```
@@ -110,19 +111,19 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
## Usage with vLLM

:::tip
To use ServerlessLLM as the load format for vLLM, you need to apply our patch `sllm_store/vllm_patch/sllm_load.patch` to the installed vLLM library. Please ensure you have applied the `vLLM Patch` as instructed in the [installation guide](../getting_started/installation.md).

You may check the patch status by running the following command:
```bash
./sllm_store/vllm_patch/check_patch.sh
```
If the patch is not applied, you can apply it by running the following command:
```bash
./sllm_store/vllm_patch/patch.sh
```
To remove the applied patch, you can run the following command:
```bash
./sllm_store/vllm_patch/remove_patch.sh
```
:::

@@ -219,7 +220,7 @@
```python
downloader = VllmModelDownloader()
downloader.download_vllm_model("facebook/opt-1.3b", "float16", 1)
```

After downloading the model, you can launch the checkpoint store server and load the model in vLLM through the `sllm` load format.

2. Launch the checkpoint store server in a separate process:
