Commit

Merge branch 'main' into dolma_taggers
peterbjorgensen authored Nov 15, 2023
2 parents 4ff1eee + 1988e05 commit ac4ebd2
Showing 4 changed files with 59 additions and 2 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -21,7 +21,7 @@ For more information please check out the following links:
| | |
| ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
| 📑 [**About**](https://centre-for-humanities-computing.github.io/danish-foundation-models/) | A overview of the DFM project |
-| [**Research Paper**](inreview) | An paper introducing DFM and its rationale |
+| [**Research Paper**](https://arxiv.org/abs/2311.07264) | An paper introducing DFM and its rationale |
| 🚀 [**Models**](https://centre-for-humanities-computing.github.io/danish-foundation-models/models_text/) | A overview of current models available through the DFM project |
| 💽 [**Datasets**](https://centre-for-humanities-computing.github.io/danish-foundation-models/dcc/) | Includes datasheets about the datasets which includes preprocessing, reason for constructions and more. |

57 changes: 57 additions & 0 deletions citation.cff
@@ -0,0 +1,57 @@
cff-version: 1.2.0
title: Danish Foundation Models
message: >-
  If you use this software or associated models, please cite
  it as below.
type: software
authors:
  - family-names: Enevoldsen
    given-names: Kenneth
  - family-names: Hansen
    given-names: Lasse
  - family-names: Nielsen
    given-names: Dan S.
  - family-names: Egebæk
    given-names: Rasmus A. F.
  - family-names: Holm
    given-names: Søren V.
  - family-names: Nielsen
    given-names: Martin C.
  - family-names: Bernstorff
    given-names: Martin
  - family-names: Larsen
    given-names: Rasmus
  - family-names: Jørgensen
    given-names: Peter B.
  - family-names: Højmark-Bertelsen
    given-names: Malte
  - family-names: Vahlstrup
    given-names: Peter B.
  - family-names: Møldrup-Dalum
    given-names: Per
  - family-names: Nielbo
    given-names: Kristoffer
identifiers:
  - type: url
    value: "https://arxiv.org/abs/2311.07264"
repository-code: >-
  https://github.com/centre-for-humanities-computing/danish-foundation-models
url: >-
  https://centre-for-humanities-computing.github.io/danish-foundation-models/
abstract: >-
  Large language models, sometimes referred to as foundation
  models, have transformed multiple fields of research.
  However, smaller languages risk falling behind due to high
  training costs and small incentives for large companies to
  train these models. To combat this, the Danish Foundation
  Models project seeks to provide and maintain open,
  well-documented, and high-quality foundation models for
  the Danish language. This is achieved through broad
  cooperation with public and private institutions, to
  ensure high data quality and applicability of the trained
  models. We present the motivation of the project, the
  current status, and future perspectives.
keywords:
  - Danish
  - natural language processing
date-released: "2023-11-14"
Binary file modified docs/_static/structure.png
2 changes: 1 addition & 1 deletion docs/index.md
@@ -17,7 +17,7 @@ Welcome to the Danish Foundation Models (DFM) project, a pioneering initiative i
3. To maintain a high standard of **documentation** of models such as model cards \[[Mitchell et al., 2019](https://arxiv.org/abs/1810.03993)\] and datasheets \[[Gebru et al., 2021](https://cacm.acm.org/magazines/2021/12/256932-datasheets-for-datasets/abstract)\].
4. To **open-source** not only the models but also all components required for reproducibility such as pre-processing, training, and validation code.

-You can read more about the argument for Danish Language models in our [publication](inreview).
+You can read more about the argument for Danish Language models in our [publication](https://arxiv.org/abs/2311.07264).


## Open-source models on closed-source data