This project generates random Chinese placeholder text (lorem ipsum) based on the real-world frequency of Chinese characters. It reads character frequencies from a file and uses them to generate random sentences.
By default, it generates 3~5 paragraphs of 4~8 sentences each, with no more than 20 characters per sentence, for about 300 characters in total.
You can customize the number of paragraphs, sentences per paragraph, and characters per sentence by passing different parameters. It can also generate text with a specific number of characters.
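The idea of frequency-weighted generation can be sketched in a few lines of Python. This is a minimal illustration and not the code in lorem_ipsum.py: the names `TOY_FREQ`, `make_sentence`, and `make_text` are hypothetical, the counts in the table are made up, and the sketch assumes the 20-character limit excludes the closing full stop.

```python
import random

# Toy frequency table: made-up counts for illustration only, not corpus data.
TOY_FREQ = {"的": 50, "一": 30, "是": 25, "不": 20, "了": 18, "人": 15, "在": 14, "有": 12}


def make_sentence(freq, max_chars=20, rng=random):
    """Draw 1..max_chars characters weighted by frequency, then add a full stop."""
    chars = list(freq)
    weights = list(freq.values())
    length = rng.randint(1, max_chars)
    return "".join(rng.choices(chars, weights=weights, k=length)) + "。"


def make_text(freq, paragraphs=(3, 5), sentences=(4, 8), max_chars=20, rng=random):
    """Build paragraphs of random sentences; ranges are inclusive (low, high) pairs."""
    result = []
    for _ in range(rng.randint(*paragraphs)):
        count = rng.randint(*sentences)
        result.append("".join(make_sentence(freq, max_chars, rng) for _ in range(count)))
    return "\n\n".join(result)


if __name__ == "__main__":
    # Defaults mirror the README: 3~5 paragraphs, 4~8 sentences, at most 20 characters.
    print(make_text(TOY_FREQ))
```

Passing different ranges, for example `make_text(TOY_FREQ, paragraphs=(1, 2), sentences=(2, 3), max_chars=10)`, changes the shape of the output. Passing `rng=random.Random(42)` makes it reproducible.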
Chinese-lorem-ipsum/
├── website/ # Website
│ ├── api/ # Backend API
│ │ ├── apidoc.md # API Documentation
│ │ └── app.py # flask app
│ ├── index.html # Homepage
│ └── ... # Other website files
├── data/ # Data source
│ ├── word_freq.txt # Word frequency (generated during preprocessing)
│ ├── char_freq.txt # Character frequency (generated during preprocessing)
│ ├── 现代汉语语料库词频表 # Word frequency
│ └── 现代汉语语料库字频表 # Character frequency
├── train/ # Test version using Markov model
│ ├── clean_corpus.py # Script to clean the corpus
│ └── model.py # Markov model
├── model/ # Markov model file
│ └── model.json # Markov model
├── lorem_ipsum.py # Script to generate random text
├── preprocess.py # Script to preprocess the frequency table
└── ... # Other files
- Clone the repository:
  git clone https://github.com/SunnyCloudYang/Chinese-lorem-ipsum.git
- Navigate to the project directory:
  cd Chinese-lorem-ipsum
- Set up a virtual environment:
  python -m venv venv
- Activate the virtual environment:
  - On Windows:
    venv\Scripts\activate
  - On macOS/Linux:
    source venv/bin/activate
- Install the required dependencies:
  pip install -r requirements.txt
- Clone the repository:
  git clone https://github.com/SunnyCloudYang/Chinese-lorem-ipsum.git
- Navigate to the website directory:
  cd Chinese-lorem-ipsum/website
- Install the required dependencies:
  npm install
- Navigate to the website directory:
  cd Chinese-lorem-ipsum/website
- Run the Vercel development server:
  vercel dev
- Open your browser and navigate to http://localhost:3000.
- Click the "Generate" button to generate a random text.
- Ensure you have the word_freq.txt file in the project directory.
- Run the script to generate a random text with 3~5 paragraphs and 4~8 sentences per paragraph:
  python lorem_ipsum.py
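As a rough illustration of how a frequency file could feed the generator, the sketch below loads a table and produces text with an exact character count. It is not the project's implementation: `load_frequencies` and `text_of_length` are hypothetical names, and the file layout (one `<token> <count>` pair per line, separated by whitespace) is an assumption that may not match the real word_freq.txt.

```python
import random
from pathlib import Path


def load_frequencies(path):
    """Parse lines of the assumed form '<token> <count>' into a dict."""
    freq = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        try:
            freq[parts[0]] = float(parts[1])
        except ValueError:
            continue  # skip header lines or non-numeric counts
    return freq


def text_of_length(freq, total_chars, max_chars=20, rng=random):
    """Emit sentences until exactly total_chars characters are used.

    Full stops are not counted toward total_chars or max_chars.
    """
    tokens = list(freq)
    weights = list(freq.values())
    sentences = []
    remaining = total_chars
    while remaining > 0:
        length = min(rng.randint(1, max_chars), remaining)
        # Every token has at least one character, so `length` tokens always
        # yield enough text; the slice trims the sentence to the exact length.
        body = "".join(rng.choices(tokens, weights=weights, k=length))[:length]
        sentences.append(body + "。")
        remaining -= length
    return "".join(sentences)
```

With these helpers, `text_of_length(load_frequencies("word_freq.txt"), 300)` would return 300 characters plus punctuation. Trimming can cut the last word of a sentence short, which is usually acceptable for filler text.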
- Fork the repository.
- Create a new branch:
  git checkout -b feature/your-feature
- Make your changes and commit them:
  git commit -m "Add your message"
- Push to the branch:
  git push origin feature/your-feature
- Open a pull request.
This project is licensed under the MIT License.