diff --git a/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/2023-11-15_cologne/2023-11-15_cologne.html b/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/2023-11-15_cologne/2023-11-15_cologne.html deleted file mode 100644 index db69b79f3..000000000 --- a/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/2023-11-15_cologne/2023-11-15_cologne.html +++ /dev/null @@ -1,1571 +0,0 @@ -CEPLAS Good Data Management Practices
-

The ARC Club

-
-

November 15th, 2023
-Dominik Brilhaus, CEPLAS Data Science

-
-
-

Get-to-know

-
    -
  • Lab
  • -
  • CEPLAS / TRR / MibiNet / ?
  • -
  • My favorite lab assay
  • -
  • Used code / programming language before
  • -
  • Have an ORCID
  • -
  • My motivation / expectation
  • -
-
-
-

House-keeping

-

Pad: https://pad.hhu.de/ZG0nGuRyRo2D0T0vHnkTpQ?view

-
    -
  • take notes
  • -
  • ask questions
  • -
  • copy / paste links, etc.
  • -
-
-
-

Materials

-

Slides will be shared via DataPLANT knowledge base

-
-
-

Tentative agenda

-
-
-

Day 1

-
    -
  • Intro to ARC and demo
  • -
  • ARCitect Hands-on
  • -
  • Create your ARCs
  • -
  • DataHUB Features
  • -
-
-
-

Day 2

-
    -
  • ARC recap session
  • -
  • ISA and Metadata
  • -
  • Swate Hands-on
  • -
  • Annotate data in your ARC
  • -
-
-
-
-
-

The ARC Club – Goals

-
    -
  • Move existing datasets into ARCs
  • -
  • Share them via the DataHUB
  • -
  • First few steps into ARCs
  • -
  • You or collaborators can pick them up from there
  • -
-
-
-

The ARC Club – Rules

-
    -
  1. You do not talk about ARC Club
  2. -
  3. You do not talk about ARC Club
  4. -
-
-
-

Rules: perfect is the enemy of good

-
    -
  • There is no perfect ARC
  • -
  • There is no complete ARC
  • -
  • The only bad ARCs are those that don't exist yet.
  • -
-
-

🚀 Let's get started, the rest is easy 🚀

-
-
-

Let's draw a typical lab workflow from your lab 📝

-
-
-

Resources

-

DataPLANT (nfdi4plants)

-

Website: https://nfdi4plants.org/
-Knowledge Base: https://nfdi4plants.org/nfdi4plants.knowledgebase/
-DataHUB: https://git.nfdi4plants.org

-

GitHub: https://github.com/nfdi4plants
-HelpDesk: https://helpdesk.nfdi4plants.org

-

💡 You can help us by raising issues, bugs, ideas...

-
-
-
-

Contributors

-

Slides presented here include contributions by

- -
-
-

The ARC ecosystem

- -

A FAIR RDM journey along a (mutable) data life cycle

-
-
-
-

Data Stewardship between DataPLANT and the community

-

-
-
-

The research data life cycle

-
-

-

https://rdmkit.elixir-europe.org, CC BY 4.0

-
-
-

Annotated Research Context (ARC)

-

-
-
-

What does an ARC look like?

-

-
-
-

ARCs store experimental data

-

-
-
-

Computations can be run inside ARCs

-

-
-
-

ARCs come with comprehensive metadata

-

-
-
-

ARC builds on standards

-

-

https://isa-tools.org/ | https://www.commonwl.org/
-https://www.researchobject.org/ro-crate/ | https://git-scm.com

-
-
-
-

Collect

-

-
-
-

Process (e.g. annotate)

- -

-
-
-

Analyse

- -

-

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Preserve

- -

-

adapted from Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Preserve and publish

- -

-

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Share and collaborate

-

-
-
-

Reuse

-

- -

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Mutable data life cycle

-

- -

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Plan – ARC scale

-

-

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Plan – proposal scale

-

Zhou et al. (2023), DataPLAN: a web-based data management plan generator for the plant sciences, bioRxiv 2023.07.07.548147; doi: https://doi.org/10.1101/2023.07.07.548147

-

https://dmpg.nfdi4plants.org

-

-
-
-

The ARC ecosystem

-

-
-
-

The ARC ecosystem

-

-
-
-

The ARC ecosystem

-

-
-
-
-

-
-

ARCitect hands-on

-
-
-

ARCitect installation

-

Please install version v0.0.10 of the ARCitect: https://github.com/nfdi4plants/ARCitect/releases/tag/v0.0.10

-

🔥 (released September 20th, 2023) 🔥

-
-
-

Download the demo data

-

https://nfdi4plant.sharepoint.com/:f:/s/Teaching/Eik7k-oJiMREgZ24kto7sIYBGxHmmZlS_Kzf7psk-5w-xg?e=u0sADd

-
-
-

Sort Demo data in an ARC

- -

-
-
-

Open ARCitect

-
    -
  1. Log in to the DataHUB (1)
    -
  2. -
-
-
-

Initiate the ARC folder structure

- -
    -
  1. Create a New ARC (2)
  2. -
  3. Select a location and name it TalinumPhotosynthesis
  4. -
-
-
-

Your ARC's name

- -

💡 By default, your ARC's name will be used

-
    -
  • for the ARC folder on your machine
  • -
  • to create your ARC in the DataHUB at https://git.nfdi4plants.org/<YourUserName>/<YourARC> (see next steps)
  • -
  • as the identifier for your investigation
  • -
-

💡 Make sure that no ARC exists at https://git.nfdi4plants.org/<YourUserName>/<YourARC>. Otherwise you will sync to that ARC.

-

💡 Don't use spaces in your ARC's name
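The naming rules above can be sketched in a few lines of Python. This is only an illustration: the helper names and the user name `jdoe` are hypothetical, and only the "no spaces" rule from this slide is checked; the DataHUB may apply further restrictions.

```python
import re

DATAHUB_BASE = "https://git.nfdi4plants.org"  # DataHUB address from the slides


def is_valid_arc_name(name: str) -> bool:
    """Return True if the name is non-empty and contains no whitespace.

    Assumption: only the "no spaces" rule from the slide is checked here.
    """
    return bool(name) and re.search(r"\s", name) is None


def datahub_url(user_name: str, arc_name: str) -> str:
    """Build the expected address <base>/<YourUserName>/<YourARC>."""
    if not is_valid_arc_name(arc_name):
        raise ValueError(f"ARC name must not contain spaces: {arc_name!r}")
    return f"{DATAHUB_BASE}/{user_name}/{arc_name}"


# "jdoe" is a made-up user name for illustration.
print(datahub_url("jdoe", "TalinumPhotosynthesis"))
```

Opening the printed address in a browser before syncing is one way to check that no ARC already exists at that location.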

-
-
-

Add a description to your investigation

-

-
-
-

Add a contributor

-
-
-

Add a study

-

by clicking "Add Study" and entering an identifier for your study

-

Use talinum_drought as an identifier

-
-
-

Study panel

-

In the study panel you can add

-
    -
  • general metadata,
  • -
  • people, and
  • -
  • publications
  • -
  • data process information
  • -
-
-
-

Add an assay

-

by clicking "Add Assay" and entering an identifier for your assay

-

Add two assays with rnaseq and metabolomics as identifiers
-

-
-
-

Link your assay to a study

-

You can either

-
    -
  • link your new assay to an existing study in your ARC or
  • -
  • create a new one
  • -
-

Link your assays to your talinum_drought study
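For orientation, the folder layout that results from these steps can be sketched in Python. This is not what ARCitect does internally; it only creates empty placeholder folders following the `studies/<study>/protocols` and `assays/<assay>/protocols` paths used later in these slides. The `metabolomics` path is assumed by analogy with `rnaseq`, and the function name is hypothetical.

```python
import tempfile
from pathlib import Path


def sketch_arc_layout(root: Path, study: str, assays: list[str]) -> list[str]:
    """Create empty placeholder folders for one study and its assays.

    Only the folder pattern shown in the slides is created; the ISA
    workbooks (isa.study.xlsx, isa.assay.xlsx) are not written.
    """
    folders = [root / "studies" / study / "protocols"]
    folders += [root / "assays" / assay / "protocols" for assay in assays]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
    # Return paths relative to the ARC root, with forward slashes.
    return sorted(f.relative_to(root).as_posix() for f in folders)


with tempfile.TemporaryDirectory() as tmp:
    arc_root = Path(tmp) / "TalinumPhotosynthesis"
    for path in sketch_arc_layout(arc_root, "talinum_drought", ["rnaseq", "metabolomics"]):
        print(path)
```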

-
-
-

Add information about your assay

-

In the assay panel you can

-
    -
  1. link or unlink the assay to studies, and
  2. -
  3. define the assay's -
      -
    • measurement type
    • -
    • technology type, and
    • -
    • technology platform.
    • -
    -
  4. -
  5. add data process information
  6. -
-
-
-

Add protocols

-

You can either

-
    -
  • directly write a new protocol within the ARCitect or
  • -
  • import an existing one from your computer
  • -
-

-
-
-

Add protocols and datasets

-

In the file tree you can

-
    -
  • add a dataset and
  • -
  • protocols associated with that dataset.
  • -
-

💡 Add Dataset lets you import data from any location on your computer into the ARC.

-

⚠️ Depending on the file size, this may take a while. Test this with a small batch of files first.

-
-
-

Sort Demo Data to your ARC

-

💡 protocols can directly be imported via ARCitect

-

💡 to add multiple dataset folders, add them manually via the file browser

-
-
-

Login to the DataHUB

-

Click Login (1) in the sidebar to log in to the DataHUB.

-

💡 This automatically opens your browser at the DataHUB (https://git.nfdi4plants.org) and asks you to log in if you are not already logged in.

-
-
-

Upload your local ARC to the DataHUB

-

From the sidebar, navigate to Versions (6)

-
-
-

Versions

-

The versions panel allows you to

-
    -
  • store the local changes to your ARC in the form of "commits",
  • -
  • sync the changes to the DataHUB, and
  • -
  • check the history of your ARC
  • -
-
-
-

Connection to the DataHUB

-

If you are logged in, the versions panel shows

-
    -
  • the full name and email of your DataHUB account
  • -
  • the URL of the current ARC in the DataHUB https://git.nfdi4plants.org/<YourUserName>/<YourARC>
  • -
-
-
-

Check if your ARC is successfully uploaded

-
    -
  1. sign in to the DataHUB
  2. -
  3. Check your projects
  4. -
-
-
-

Your ARC is ready

- -

👩‍💻 Initiated an ARC
-

-📂 Structured and ...
-

- ... annotated experimental data
-

-🌐 Shared with collaborators

-
-
-
-

Received two emails from "GitLab" about a failed pipeline?

-

-

🔥 Don't worry 😄

-
-
-

Pipeline Failed

- -
    -
  • -

    a "continuous quality control" (CQC) pipeline validates your ARC

    -
  • -
  • -

    This fails if one of the following metadata items is missing:

    -
    Investigation Identifier
    -Investigation Title
    -Investigation Description
    -Investigation Person Last Name
    -Investigation Person First Name
    -Investigation Person Email
    -Investigation Person Affiliation
    -
    -
  • -
-
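As a rough illustration of the check described above, the following Python sketch reports which of the listed items are missing. It is not the DataHUB's actual CQC pipeline; representing the investigation metadata as a flat dictionary is an assumption made only for this example, as are the example values.

```python
# The seven items listed on the slide.
REQUIRED_ITEMS = [
    "Investigation Identifier",
    "Investigation Title",
    "Investigation Description",
    "Investigation Person Last Name",
    "Investigation Person First Name",
    "Investigation Person Email",
    "Investigation Person Affiliation",
]


def missing_items(metadata: dict[str, str]) -> list[str]:
    """Return the required items that are absent or blank."""
    return [key for key in REQUIRED_ITEMS if not metadata.get(key, "").strip()]


# Hypothetical, incomplete investigation metadata.
example = {
    "Investigation Identifier": "TalinumPhotosynthesis",
    "Investigation Title": "",
    "Investigation Person Last Name": "Doe",
}

for item in missing_items(example):
    print("missing:", item)
```

Filling in every item reported as missing in ARCitect before syncing should avoid the failure described on this slide.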
-
-

Pipeline Failed

-

If the pipeline has failed once, it is disabled by default

-
-
-

Reactivate the CQC pipeline

- -

To reactivate it and let the DataHUB validate your ARC again:

-
    -
  1. navigate to the CI/CD settings at <arc-url>/-/settings/ci_cd
  2. -
  3. expand "Auto DevOps"
  4. -
  5. check box "Default to Auto DevOps pipeline"
  6. -
  7. Save changes
  8. -
-
-
-
-

-
-

DataPLANT DataHUB

-
-
-

ARC builds on standards + Git

-

-
-
-

The DataPLANT DataHUB – a GitLab Plus

-

-
-
-
-
-
-
-
-
-
-

Mutable data life cycle

-

- -

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-

Project management

-

-

Weil, H.L., Schneider, K., et al. (2023), PLANTdataHUB: a collaborative platform for continuous FAIR data sharing in plant research. Plant J. https://doi.org/10.1111/tpj.16474

-
-
-
-

-
-

DataHub Hands-On

-
-
-

Navigation Bar

-

-
    -
  1. navigate directly to the projects panel via the icon in the top-left (1)
  2. -
  3. open the hamburger Menu (2)
  4. -
  5. use the search field (3) to find ARCs, users and groups
  6. -
  7. open the avatar Menu (4)
  8. -
-
-
-

Hamburger Menu

-
    -
  1. From the hamburger menu (1) you can
  2. -
  3. navigate to the projects (2)
  4. -
  5. or groups (3) panels
  6. -
-
-
-

Avatar Menu

-
    -
  1. In the avatar menu (1) you can
  2. -
  3. find your profile name and user name (2),
  4. -
  5. navigate to the user settings (3)
  6. -
  7. or sign out (4) of the DataHUB.
  8. -
-
-
-

Projects Panel

-

-
    -
  1. Choose a tab (1) to see only your ARCs, or explore other publicly available ARCs.
  2. -
  3. The main panel (2) lists all ARCs
  4. -
  5. Here you can also see the visibility level (3), and
  6. -
  7. your permission or role (4) for the listed ARC.
  8. -
  9. You can create a New Project in the top-right corner (5).
  10. -
-
-
-

ARC Panel

-

The ARC Panel is the main working area for your ARC.

-

-
-
-

ARC Panel – sidebar

- -
    -
  1. access the project information (1), e.g. invite members to the ARC
  2. -
  3. follow the progress of your ARC repository (2),
  4. -
  5. organize tasks in issue lists and boards (3),
  6. -
  7. take notes in your ARC's wiki (4),
  8. -
  9. adapt the settings (5) of the ARC.
  10. -
-
-
-

ARC Panel – main panel

- -
    -
  1. see the ARC's name and visibility level (6),
  2. -
  3. follow the ARC's commit history (7),
  4. -
  5. see files contained in your ARC just like on your computer (8),
  6. -
  7. add new files or directories (9), and
  8. -
  9. download or clone your ARC (10).
  10. -
-
-
-

Collaborate and share

-

-
-
-

Invite collaborators

-
    -
  • Unless changed, your ARC is set to private by default.
  • -
  • To collaborate, you can invite lab colleagues or project partners to your ARC by following the steps on the subsequent slides.
  • -
  • To get started sign in to the DataHUB and open the ARC you want to share.
  • -
-
-
-
    -
  1. Click on Project Information in the left navigation panel
  2. -
-


-
-
-
    -
  1. Click on Members
  2. -
-


-
-
-
    -
  1. Click on Invite members
  2. -
-


-
-
-
    -
  1. Search for potential collaborators
  2. -
-


-
-
-
    -
  1. Select a role
  2. -
-


-
-
-

Choosing the proper role

- -

Guests
-Have the least rights. They will not be able to see the content of your ARC (only the wiki page).

-

Reporters
-Have read access to your ARC. This is recommended for people you consult for advice.

-

Developers
-The choice for most people you want to invite to your ARC. Developers have read and write access, but cannot maintain the project on the DataHUB, e.g. inviting others.

-

Maintainers
-Gives the person the same rights as you have (except for removing you from your own project). This is recommended for inviting PIs or group leaders, allowing them to add their group members to the project for data upload or analysis as well.

-

A detailed list of all permissions for the individual roles can be found here

-
-
-

Congratulations!

-
You have just shared your ARC with a collaborator.
- -

-
-
-

Version control

-
    -
  • Commit history
  • -
-
-
-

Project Management

-
    -
  • Issues
  • -
-
-
-

ARCs come with their own wiki space

-
    -
  • directly associated with your ARC
  • -
  • same access rights as your ARC
  • -
  • share meeting minutes or ideas with collaboration partners
  • -
  • keep ARC clean of files that are not considered "research data"
  • -
-
-
-
-

-
-

Metadata and ISA

-
-
-

What is
metadata?

-
-
-

Viola's PhD Project

-

Exercise: Take 5 minutes to note down the metadata

- -

Viola investigates the effect of the plant circadian clock on sugar metabolism in W. mirabilis. For her PhD project, which is part of an EU-funded consortium in Prof. Beetroot's lab, she acquires seeds from a South-African botanical society. Viola grows the plants under different light regimes, harvests leaves from a two-day time series experiment, extracts polar metabolites as well as RNA and submits the samples to nearby core facilities for metabolomics and transcriptomics measurements, respectively. After a few weeks of iterative consultation with the facilities' heads as well as technicians and computational biologists involved, Viola receives back a wealth of raw and processed data. From the data she produces figures and wraps everything up to publish the results in the Journal of Wonderful Plant Sciences.

-
-
-

Metadata everywhere

- -

Viola investigates the effect of the plant circadian clock on sugar metabolism in W. mirabilis. For her PhD project, which is part of an EU-funded consortium in Prof. Beetroot's lab, she acquires seeds from a South-African botanical society. Viola grows the plants under different light regimes, harvests leaves from a two-day time series experiment, extracts polar metabolites as well as RNA and submits the samples to nearby core facilities for metabolomics and transcriptomics measurements, respectively. After a few weeks of iterative consultation with the facilities' heads as well as technicians and computational biologists involved, Viola receives back a wealth of raw and processed data. From the data she produces figures and wraps everything up to publish the results in the Journal of Wonderful Plant Sciences.

-
-
-

Project metadata

-
-
-

project design

-
    -
  • researcher
  • -
  • institute and project
  • -
  • biological context
  • -
  • research question
  • -
  • purpose of data collection
  • -
  • ...
  • -
-
-
-

experimental processes

-
    -
  • origin and nature of the biological material
  • -
  • lab protocols
  • -
  • instrument model
  • -
  • ...
  • -
-
-
-

data-analytical processes

-
    -
  • algorithms
  • -
  • tools
  • -
  • software versions and dependencies employed
  • -
  • ...
  • -
-
-
-
-
-

Other types of metadata

-
-
-

bibliographic

-
    -
  • Title
  • -
  • Publication date and title
  • -
  • Description
  • -
  • Author
  • -
  • Contacts
  • -
  • Keywords
  • -
  • ...
  • -
-
-
- -
    -
  • data origin, ownership, provenance
  • -
  • licensing
  • -
  • ethical aspects
  • -
  • ...
  • -
-
-
-

technical

-
    -
  • expected data volume
  • -
  • storage location
  • -
  • file formats
  • -
  • ...
  • -
-
-
-
-
-

Metadata from a FAIR perspective

-
-
-

Findable

-
    -
  • metadata names the content of the data
  • -
  • basis for search engines
  • -
  • makes it categorizable for people and machines
  • -
-

Accessible

-
    -
  • information about origin
  • -
  • location of storage
  • -
  • access rights
  • -
-
-
-

Interoperable

-
    -
  • metadata identifies software and file formats
  • -
  • required conversions between file formats
  • -
-

Reusable

-
    -
  • obtain and reuse research data according to clear rules described in licenses
  • -
-
-
-
-
-

Metadata "Standards"

-

Examples from Minimum Information for Biological and Biomedical Investigations (MIBBI):

- -

💡 Check out https://fairsharing.org/ for more examples

-
-
-

Metadata standards ≈ Checklists

-
    -
  • Determine (minimal) required information
  • -
  • Usually do not determine the format (i.e. shape or file type)
  • -
-
-
-

A small Interactive detour

-

-> favorite Movie

-
-
-

How does Google "know"?!

-

-
-
-

Schemas and machine-readability

-
-
-

Structured data and the internet

-

Schema.org

-
    -
  • create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, ...
  • -
  • Structured data can be used to mark up all kinds of items from products to events to recipes
  • -
  • Communicate with search engines (-> SEO, search engine optimization)
  • -
  • Enhance findability from search engine results
  • -
  • Provide context to an ambiguous webpage
  • -
  • Metadata interoperability and standardization across all websites using schema.org
  • -
-
-
-

Structured data and the internet: Schema.org

- -

https://schema.org/Person

-
<script type="application/ld+json">
-{
-  "@context": "https://schema.org",
-  "@type": "Person",
-  "address": {
-    "@type": "PostalAddress",
-    "addressLocality": "Seattle",
-    "addressRegion": "WA",
-    "postalCode": "98052",
-    "streetAddress": "20341 Whitworth Institute 405 N. Whitworth"
-  },
-  "colleague": [
-    "http://www.xyz.edu/students/alicejones.html",
-    "http://www.xyz.edu/students/bobsmith.html"
-  ],
-  "email": "mailto:jane-doe@xyz.edu",
-  "image": "janedoe.jpg",
-  "jobTitle": "Professor",
-  "name": "Jane Doe",
-  "telephone": "(425) 123-4567",
-  "url": "http://www.janedoe.com"
-}
-</script>
-
-
-
-

JSON-LD

- -

JSON-LD = JavaScript Object Notation for Linked Data

-
<script type="application/ld+json">
-  {
-    "@context": "https://schema.org",
-    "@type": "SportsTeam",
-    "name": "San Francisco 49ers",
-    "member": {
-      "@type": "OrganizationRole",
-      "member": {
-        "@type": "Person",
-        "name": "Joe Montana"
-      },
-      "startDate": "1979",
-      "endDate": "1992",
-      "roleName": "Quarterback"
-    }
-  }
-</script>
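Because JSON-LD is plain JSON, any JSON parser can read it. The following minimal Python sketch loads the example above with the standard library and pulls out the nested values; it treats the document as ordinary JSON and does not perform JSON-LD processing such as context expansion.

```python
import json

# The JSON-LD example from the slide, without the surrounding <script> tag.
doc = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "SportsTeam",
  "name": "San Francisco 49ers",
  "member": {
    "@type": "OrganizationRole",
    "member": {"@type": "Person", "name": "Joe Montana"},
    "startDate": "1979",
    "endDate": "1992",
    "roleName": "Quarterback"
  }
}
""")

role = doc["member"]            # the OrganizationRole node
person = role["member"]["name"]  # the Person nested inside the role
summary = f'{person}: {role["roleName"]}, {doc["name"]}, {role["startDate"]} to {role["endDate"]}'
print(summary)
```

The `@type` keys are what let a search engine interpret "Joe Montana" as a person holding a role in a team rather than as an arbitrary string.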
-
-
-
-

RDFa

-

RDFa = Resource Description Framework in Attributes

-
<div vocab="http://schema.org/" typeof="SportsTeam">
-  <span property="name">San Francisco 49ers</span>
-  <div property="member" typeof="OrganizationRole">
-    <div property="member" typeof="http://schema.org/Person">
-      <span property="name">Joe Montana</span>
-    </div>
-    <span property="startDate">1979</span>
-    <span property="endDate">1992</span>
-    <span property="roleName">Quarterback</span>
-  </div>
-</div>
-
-
-
-

Standards

-

Dublin Core

-

https://www.dublincore.org/schemas/

-

DataCite Schema

- -
-
-

DataCite Schema: Simple Example

-
...
-  <identifier identifierType="DOI">10.5072/D3P26Q35R-Test</identifier>
-  <creators>
-    <creator>
-      <creatorName nameType="Personal">Fosmire, Michael</creatorName>
-      <givenName>Michael</givenName>
-      <familyName>Fosmire</familyName>
-    </creator>
-    <creator>
-      <creatorName nameType="Personal">Wertz, Ruth</creatorName>
-      <givenName>Ruth</givenName>
-      <familyName>Wertz</familyName>
-    </creator>
-    <creator>
-      <creatorName nameType="Personal">Purzer, Senay</creatorName>
-      <givenName>Senay</givenName>
-      <familyName>Purzer</familyName>
-    </creator>
-  </creators>
-  <titles>
-    <title xml:lang="en">Critical Engineering Literacy Test (CELT)</title>
-  </titles>
-  <publisher xml:lang="en">Purdue University Research Repository (PURR)</publisher>
-  <publicationYear>2013</publicationYear>
-  <subjects>
-    <subject xml:lang="en">Assessment</subject>
-    <subject xml:lang="en">Information Literacy</subject>
-    <subject xml:lang="en">Engineering</subject>
-    <subject xml:lang="en">Undergraduate Students</subject>
-    <subject xml:lang="en">CELT</subject>
-    <subject xml:lang="en">Purdue University</subject>
-  </subjects>
-  <language>en</language>
-  <resourceType resourceTypeGeneral="Dataset">Dataset</resourceType>
-...
-
-

https://schema.datacite.org/meta/kernel-4.3/example/datacite-example-dataset-v4.xml
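A structured record like this can be read by a program as easily as by a person. The following Python sketch parses a shortened copy of the example with the standard library. Two simplifications are assumptions of this sketch: the fragment is wrapped in a bare `<resource>` root element, and the XML namespace that a complete DataCite file declares is omitted.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the slide's example, wrapped in a root element
# without namespace (an assumption of this sketch).
fragment = """
<resource>
  <identifier identifierType="DOI">10.5072/D3P26Q35R-Test</identifier>
  <creators>
    <creator><creatorName nameType="Personal">Fosmire, Michael</creatorName></creator>
    <creator><creatorName nameType="Personal">Wertz, Ruth</creatorName></creator>
    <creator><creatorName nameType="Personal">Purzer, Senay</creatorName></creator>
  </creators>
  <titles>
    <title xml:lang="en">Critical Engineering Literacy Test (CELT)</title>
  </titles>
  <publicationYear>2013</publicationYear>
</resource>
"""

root = ET.fromstring(fragment)
doi = root.findtext("identifier")
creators = [c.findtext("creatorName") for c in root.iter("creator")]
title = root.findtext("titles/title")
year = root.findtext("publicationYear")

print(f"{'; '.join(creators)} ({year}). {title}. doi:{doi}")
```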

-
-
-

Ontologies

-
-
-

Ontology

-

(Sometimes also referred to as a "semantic model")

-

An ontology combines features of

-
    -
  • a dictionary,
  • -
  • a taxonomy, and
  • -
  • a thesaurus
  • -
-
-
-

Dictionary

-

Alphabetically lists terms and their definitions
-

-

Pizza: "a dish made typically of flattened bread dough spread with a savory mixture usually including tomatoes and cheese and often other toppings and baked"

-
-
-

Taxonomy

-

Hierarchy or classification

-
-
-

Thesaurus

-

Dictionary of synonyms and relations
-

-

Pizza ≈ Lahmacun ≈ Focaccia ≈ Flammkuchen

-
-
-

Ontology

-
    -
  • Structures a set of concepts in a particular area and the relations between them in a graph-like manner
  • -
  • Can be used for disambiguation, for defining hierarchies, and as a standard for defining terms
  • -
  • Define a common vocabulary of concepts and their relationships to model a particular domain while making it machine understandable
  • -
-
-
-

The semantic triple

-

-
-
-

Modeling a pizza menu

-

-
-
-

Modeling a pizza menu

-

-
-
-

Modeling a pizza menu

-

-
-
-

Predicates have two directions

-

-
-
-

Looking at the menu from a different perspective

-

An object of one triple can be the subject of another

-

-
-
-

(Towards) a knowledge graph

-

-
-
-

Searching the menu

-

An ontology can be queried:

-
    -
  • "name all pizzas with topping mushrooms"
  • -
-
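Such a query can be sketched in Python with triples stored as plain (subject, predicate, object) tuples. The pizza names, toppings and the predicate names `hasTopping` / `isToppingOf` are made up for illustration; real ontologies are usually queried with dedicated tools rather than a list comprehension.

```python
# Hypothetical menu: pizza names and toppings are made up for illustration.
TRIPLES = [
    ("Margherita", "hasTopping", "tomato"),
    ("Margherita", "hasTopping", "mozzarella"),
    ("Funghi", "hasTopping", "tomato"),
    ("Funghi", "hasTopping", "mushrooms"),
    ("Capricciosa", "hasTopping", "mushrooms"),
    ("Capricciosa", "hasTopping", "ham"),
]

# A predicate can be read in both directions via its inverse.
INVERSE = {"hasTopping": "isToppingOf"}


def subjects_with(predicate: str, obj: str) -> list[str]:
    """Return all subjects s for which (s, predicate, obj) is in the graph."""
    return sorted(s for s, p, o in TRIPLES if p == predicate and o == obj)


def inverted(triples):
    """Yield each triple read in the opposite direction, if an inverse is known."""
    for s, p, o in triples:
        if p in INVERSE:
            yield (o, INVERSE[p], s)


# "name all pizzas with topping mushrooms"
print(subjects_with("hasTopping", "mushrooms"))

# The same facts, seen from the topping's perspective.
for triple in inverted(TRIPLES):
    print(triple)
```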
-
-

The Pizza Ontology

- -
-
-

Example ontologies

-

EDAM ontology

- -

PECO ontology

- -
-

Explore more examples

- -
-
-
-

ARC builds on ISA

-

-

https://isa-tools.org/format/specification.html

-
-
-

ARC builds on ISA

-

-
-
-

isa.<>.xlsx files within ARCs

-

-
-
-

Study and assay files are registered in the investigation file

-

-
-
-

The output of a study or assay file can function as input for a new isa.assay.xlsx

-

Output building blocks:

-
    -
  • Sample Name
  • -
  • Raw Data File
  • -
  • Derived Data File
  • -
-
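The chaining of outputs to inputs can be illustrated with a small Python sketch. Each row records one input, the type of output building block, and the output name; all sample and file names are hypothetical, and the `Row` structure is a simplification rather than the actual layout of an ISA workbook.

```python
from typing import NamedTuple


class Row(NamedTuple):
    input_name: str
    output_type: str  # "Sample Name", "Raw Data File" or "Derived Data File"
    output_name: str


# Hypothetical study table: plants in, leaf samples out.
study = [
    Row("plant_1", "Sample Name", "leaf_1"),
    Row("plant_2", "Sample Name", "leaf_2"),
]

# Hypothetical assay table: the study's outputs are the assay's inputs.
assay = [
    Row("leaf_1", "Raw Data File", "leaf_1.fastq.gz"),
    Row("leaf_2", "Raw Data File", "leaf_2.fastq.gz"),
]


def trace_back(output_name: str, tables: list[list[Row]]) -> list[str]:
    """Follow an output back through the tables, last table first."""
    current = output_name
    path = [current]
    for table in reversed(tables):
        match = next((row for row in table if row.output_name == current), None)
        if match is None:
            break
        current = match.input_name
        path.append(current)
    return path


print(trace_back("leaf_1.fastq.gz", [study, assay]))
```

Because the names match across tables, a data file can be traced back to the plant it came from.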
-
-

-
-
-

Swate

-
-
-

Annotation by flattening the knowledge graph

-

-
    -
  • Low-friction metadata annotation
  • -
  • Familiar spreadsheet, row/column-based environment
  • -
-
-
-

Annotation principle

- -

-
    -
  • Low-friction metadata annotation
  • -
  • Familiar spreadsheet, row/column-based environment
  • -
-
-
-

Adding new building blocks (columns)

-

-
    -
  • Swate can be used for the annotation of isa.study.xlsx and isa.assay.xlsx files
  • -
-
-
-

Annotation Building Block types

- -
    -
  • Source Name (Input)
  • -
  • Protocol Columns -
      -
    • Protocol Type, Protocol Ref
    • -
    -
  • -
  • Characteristic
  • -
  • Parameter
  • -
  • Factor
  • -
  • Component
  • -
  • Output Columns -
      -
    • Sample Name, Raw Data File, Derived Data File
    • -
    -
  • -
-

Let's take a detour on Annotation Principles | slides

-
-
-

Ontology term search

- -

-

Enable related term directed search to directly fill cells with child terms

-
-
-

Fill your table with ontology terms

-

-
-
-

Hierarchical combination of ontologies

-

-
-
-

Swate templates

-
-
-

Checklists and Templates

-

-

Metadata standards or repository requirements can be represented as templates

-
-
-

Realization of lab-specific metadata templates

-

-

Facilities can define their most common workflows as templates

-
-
-

Directly import templates via Swate

-
    -
  • DataPLANT curated
  • -
  • Community templates
  • -
-
-
-
-

-
-

Swate hands-on

-
-
-

Goals

-
    -
  • Get familiar with ISA metadata and Swate
  • -
  • Annotate data in your ARC
  • -
-
-
-

Check Swate installation

-

☑️ Make sure Swate is installed:

-
    -
  1. Open Excel (online or Desktop)
  2. -
  3. Go to the Insert tab: Click the arrow next to "My Add-ins". There you should be able to select Swate.
  4. -
  5. Go to the Data tab: you should see the Swate (Core) add-in.
  6. -
-

💡 Alternatively, you can use Swate standalone
-(⚠️ this is however work in progress and likely to change)

-
-
-

Have a simple text editor ready

-
    -
  • Windows Notepad
  • -
  • MacOS TextEdit
  • -
-

Recommended text editor with code highlighting, git support, terminal, etc: Visual Studio Code

-
-
-

Download the demo data

- -
    -
  1. Open the ARCitect
  2. -
  3. Login (1) to your DataHUB account
  4. -
  5. Navigate to Download ARC (4)
  6. -
-
-
-

Download the demo data

-
    -
  1. Search for Talinum-CAM-Photosynthesis
  2. -
  3. Click the download button, select a location and open the ARC.
  4. -
-

-

💡 This is basically the ARC we created last session.

-
-
-

Where we left off last time

-

👩‍💻 Initiated an ARC
-📂 Structured and ...
-🌐 Shared with collaborators

-
-

Today we want to

-

... annotate the experimental data

-
-
-

Swate hands-on with demo data

-
-
-

Swate Overview

-

-
-
-

Let's annotate the plant samples first

-
    -
  1. Navigate to the demo ARC.
  2. -
  3. Open the lab notes studies/talinum_drought/protocols/plant_material.txt in a text editor.
  4. -
  5. Open the empty studies/talinum_drought/isa.study.xlsx workbook in Excel.
  6. -
-
-
-

Create an annotation table

-
-
-
-

Create a Swate annotation table via the create annotation table button in the yellow pop-up box OR click the Create Annotation Table quick access button.

-
-
-

💡 Each table is by default created with one input (Source Name) and one output (Sample Name) column

-
-
-

💡 Only one annotation table can be added per Excel sheet

-
-
-
- -
-
-
-
-

Add a building block

-
    -
  1. Navigate to the Building Blocks tab via the navbar. Here you can add Building Blocks to the table.
  2. -
  3. Instead of Parameter select Characteristic from the drop-down menu (A)
  4. -
  5. Search for organism in the search bar (B). This search looks for suitable Terms in our Ontology database.
  6. -
  7. Select the Term with the id OBI:0100026 and,
  8. -
  9. Click Add building block.
  10. -
-
-

💡 This adds three columns to your table, one visible and two hidden.

-
-
-
-

Insert values to annotate your data

-
    -
  1. Navigate to the Terms tab in the Navbar
  2. -
  3. In the annotation table, select any number of cells below Characteristic [organism]
  4. -
  5. Click into the search field in Swate.
  6. -
-
-

💡 You should see organism showing in a field in front of the search field
-💡 The search will now yield results related to organism

-
-
    -
  1. In the search field, search for "Talinum fruticosum"
  2. -
  3. Select the first hit and click Fill selected cells with this term
  4. -
-
-
-

Add a building block with a unit

-
    -
  1. In the Building Blocks tab, select Parameter, search for light intensity exposure and select the term with id PECO:0007224.
  2. -
  3. Check the box for This Parameter has a unit and search for microeinstein per square meter per second in the adjacent search bar.
  4. -
  5. Select UO:0000160.
  6. -
  7. Click Add building block.
  8. -
-
-

💡 This adds four columns to your table, one visible and three hidden.

-
-
-
-

Insert unit-values to annotate your data

-

In the annotation table, select any cell below Parameter [light intensity exposure] and add "425" as light intensity.

-
-

💡 You can see the numbers being complemented with the chosen unit, e.g. 425.00 microeinstein per square meter per second

-
-
-
-

Showing ontology reference columns

-

Hold Ctrl and click the Autoformat Table quick access button to adjust column widths and un-hide all hidden columns.

-
-

💡 You can see that your organism of choice was added with id and source Ontology in the reference (hidden) columns.
-⚠️ This feature is currently not supported on MacOS

-
-
-
-

Update ontology reference columns

-

Click the Update Ontology Terms quick access button.

-
-

💡 This updates all reference columns according to the main column. In this case the reference columns for Parameter [light intensity exposure] are updated with the id and source ontology of the microeinstein per square meter per second unit.

-
-
-
-

Your ISA table is growing

-

At this point, your table should look similar to this:

-

-
-
-

Hiding ontology reference columns

-

Click the Autoformat Table quick access button without holding Ctrl to hide all reference columns.

-
-
-

Exercise 📝

-

Try to add suitable building blocks for other pieces of metadata from the plant growth protocol (studies/talinum_drought/protocols/plant_material.txt).

-
-
-

Let's annotate the RNA Seq data

-
    -
  1. Navigate to the demo ARC.
  2. -
  3. Open the lab notes assays/rnaseq/protocols/RNA_extraction.txt in a text editor.
  4. -
  5. Open the empty assays/rnaseq/isa.assay.xlsx workbook in Excel.
  6. -
-
-
-

Use a template

-
    -
  1. Navigate to Templates in the Navbar and click Browse database in the first function block.
  2. -
-
-

💡 Here you can find community created workflow annotation templates

-
-
    -
  1. Search for RNA extraction and click select -
      -
    • You will see a preview of all building blocks which are part of this template.
    • -
    -
  2. -
  3. Click Add template to add all Building Blocks from the template that do not yet exist in your table.
  4. -
-
-
-

Adding / Updating unit references

-

Sometimes you need to add or update the unit of an existing building block.

-
    -
  1. Select any number of rows of the Parameter [biosource amount] building block to mark it for the next steps.
  2. -
  3. Open the Building Blocks tab
  4. -
  5. In the bottom panel "Add/Update unit reference to existing building block", search for the unit "milligram". Select the unit term and click Update unit for cells.
    -💡 If you already had values in the main column they will be updated automatically.
  6. -
  7. Click the Update Ontology Terms quick access button to update the reference columns.
  8. -
-
-
-

Remove building blocks

-

If any Building Blocks do not fit your experiment, use the Remove Building Block quick access button to remove them, including all related (hidden) reference columns.

-

⚠️ Due to the hidden reference columns, we recommend not to delete table columns via usual Excel functions.

-
-
-

New process, new worksheet

-
    -
  1. Add a new sheet to the assays/rnaseq/isa.assay.xlsx workbook.
  2. -
  3. Add the template "RNASeq Assay"
  4. -
-
-
-

Exercise 📝

-

Try to fill the two sheets with the protocol details:

-
    -
  • assays/rnaseq/protocols/RNA_extraction.txt and
  • -
  • assays/rnaseq/protocols/Illumina_libraries.txt
  • -
-
-
-

Your ISA table is ready 🎉

-

Go ahead, adjust the Building Blocks you want to use to describe your experiment as you see fit.
-Insert values using Swate Term search and add input and output.

-
-
-

A small detour on "Excel Tables"

-

Swate uses Excel's "table" feature to annotate workflows. Each table represents one process from input (e.g. plant leaf material) to output (e.g. leaf extract).

-

Example workflows with three processes each:

-
    -
  • Plant growth → sampling → extraction
  • -
  • Measured data files → statistical analysis → result files
  • -
-
-

💡 Excel tables allow you to group data that belongs together inside one sheet. A table is not to be confused with a (work)sheet or a workbook.

-
workbook              (e.g. "isa.assay.xlsx")
- └─── worksheet       (e.g. "plant_growth")
-          └─── table  (e.g. "annotationTable")
-
-
-
-
-

🚧 Known issues with ARCitect and Swate (Nov 2023)

-
    -
  1. Annotation within ARCitect is not yet available.
  2. -
  3. Swate and ARCitect handle isa.study.xlsx / isa.assay.xlsx files differently.
  4. -
-
-
-
-

Contributors

-

Slides presented here include contributions by

- -
-
-

The ARC Club

-

Happy ARCing

-
-
-

Moving from FileShare to DataHUB – via ARCs

-

-
-
-

Assign projects

-
-
-

Rough routine for each project

-
    -
  1. Identify the available data and resources
  2. -
  3. Create the ARC
  4. -
  5. Add metadata and data
  6. -
  7. Share via DataHUB group
  8. -
-
-
-

Low(er) hanging fruits: published projects

-
    -
  1. Add the authors
  2. -
  3. Add the publication(s) -
      -
    1. Add citation and DOI
    2. -
    3. Add supplemental material
    4. -
    5. Convert M&M to protocols
    6. -
    -
  4. -
  5. Reference data in public repositories
  6. -
  7. Add large data (e.g. from file share)
  8. -
  9. Set ARC to public!
  10. -
-
-
-

More challenging ARCs

-
    -
  • (unpublished) left-overs of colleagues who have since moved on
  • -
  • unclear status
  • -
-
-
-

Collect / derive as much info about the investigation as possible

-

MUST haves

-
Investigation Identifier
-Investigation Title
-...
-Investigation Publication Status
-...
-Investigation Person Last Name
-Investigation Person First Name
-
-

💡 This and more investigation-level info can be collected in the ARC's isa.investigation.xlsx
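Before creating the ARC, it can help to check which of the must-have fields are still unknown. The following is only a sketch: the field names are the ones listed above, while `missing_fields` and the example values are hypothetical and not part of any ARC tool.

```python
from typing import Dict, List, Optional

# Must-have fields exactly as named on the slide (elided entries are left out).
REQUIRED_FIELDS: List[str] = [
    "Investigation Identifier",
    "Investigation Title",
    "Investigation Publication Status",
    "Investigation Person Last Name",
    "Investigation Person First Name",
]


def missing_fields(metadata: Dict[str, Optional[str]]) -> List[str]:
    """Return the required fields that are absent or empty in `metadata`."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = metadata.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Hypothetical example values, for illustration only.
collected = {
    "Investigation Identifier": "example-investigation",
    "Investigation Title": "Example investigation title",
    "Investigation Person Last Name": "Doe",
    "Investigation Person First Name": "Jane",
}

still_missing = missing_fields(collected)
print("Still to collect:", still_missing)
```

A check like this only tells you what to ask colleagues about; the values themselves still go into isa.investigation.xlsx.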

-
-
-

Create and share the ARC

-
arc init
-arc sync -f -r https://git.nfdi4plants.org/<GroupName>/<InvestigationID>
-
-
-
-

Copy data

-
    -
  1. Copy data to the ARC; do not move it from the original source
    -(we'll take care of that later)
  2. -
  3. Ideally use rsync rather than copying manually
  4. -
  5. Ideally use md5 or md5sum checksums to verify that files were transferred correctly
  6. -
-

💡 Ask the coders for help!
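For those who prefer a script over the command line, the copy-then-verify routine could look roughly like the sketch below. It uses Python's standard library instead of rsync and md5sum (a substitution, not the tools named above), and all paths and file names are hypothetical placeholders created in a temporary directory.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, target: Path) -> bool:
    """Copy `source` to `target` (the original stays in place) and compare checksums."""
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return md5_of(source) == md5_of(target)


# Demo in a temporary directory; the folder layout below is an assumption.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "fileshare" / "sample_1.fastq.gz"
    source.parent.mkdir(parents=True)
    source.write_bytes(b"placeholder read data")

    target = Path(tmp) / "my_arc" / "assays" / "rnaseq" / "dataset" / source.name
    copy_ok = copy_and_verify(source, target)
    source_still_exists = source.exists()
    print("transfer verified:", copy_ok)
```

The original file is copied, never moved, so the file share stays untouched until the ARC copy has been verified.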

-
-
-

Perspective and administration in the future

-
-
-

Administration / Backup

-

Alt text

-
-

Source to slide(s)

./../../../../img/10-WelcomeIntro.md

- create study folder - - take a picture (add more demo pictures) -- create assay folder - - add fastq data

- annotate plant samples -- annotate rnaseq extraction

- run fastqc -- receive back results

1. Validation: CQC on each DataHUB commit -2. Publication: DOI

1. Validation: CQC on each DataHUB commit -2. Publication: DOI

- via ARC https://arcregistry.nfdi4plants.org/arcsearch -- via ISA https://arcregistry.nfdi4plants.org/isasearch

- Invite other (demo) account -- add notes from there

- **ARCitect**: Create empty ARC - - description - - author - - first name - - last name - - email - - **ARCitect**: Upload ARC to DataHUB - - **DataHUB** - - Discuss, collect meeting minutes in Wiki - - design / plant investigation (datahub wiki, issues)

Source to slide(s)

./../../../../img/20-ARC-ecosystem-demo.md

Demo dataset cannot be added via add dataset. Only individual files can be added, not multiple folders

Source to slide(s)

./../../../../img/26-ARCitect-HandsOn.md

- Invite other (demo) account -- add notes from there

Source to slide(s)

./../../../../img/42-DataHUB.md

Source to slide(s)

./../../../../img/43-DataHUB-HandsOn.md

Exercise: Association map - -Online: Let participants annotate (via video conference tool) -Presence: Draw map on (white) board

- let participant name a movie -- how do you find out the actors, director, release year, etc.? -- => google.com -- google movie - see knowledge graph to the right - - how does google know all that?! -- ===> schema.org

TODO: -- This is actually not a proper ontology(!), but rather a knowledge graph (= ontology + data)

LIVE-Demo -- Search an "interesting" term from PECO in browser (EBI OLS) - - Example: - - plant exposure - abiotic plant exposure - physical plant exposure - water environment exposure - drought environment exposure -- Show the graph view (and expand it interactively) -- Mention that terms (subjects, objects) and properties (predicates) have "URIs", "PIDs" -- Show that terms can have alternative / external IDs and link to "outdated" ontologies

<style scoped> -section p img{ - /* padding-left: 230px */ -} -</style>

Combining ISA (Characteristics, Parameter, Factor) with a biological or technological ontology (e.g. temperature, strain, instrument model) gives you the flexibility to use an ontology term, e.g. temperature, either as a regular process parameter or as the factor your study is based on (Parameter \[temperature\] or Factor \[temperature\]).

Source to slide(s)

./../../../../img/60-MetadataISA.md

Source to slide(s)

./../../../../img/70-Swate-HandsOn.md

Source to slide(s)

./../../../../img/90-ARCing.md

\ No newline at end of file diff --git a/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/70-Swate-HandsOn.html b/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/70-Swate-HandsOn.html index 2369689d3..58ae43980 100644 --- a/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/70-Swate-HandsOn.html +++ b/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/70-Swate-HandsOn.html @@ -1,4 +1,4 @@ -Swate hands-on
-

Swate hands-on

+/* Styling page number */div#\:\$p>svg>foreignObject>section:after{font-family:Calibri,sans-serif;font-size:20px;color:#2D3E50}div#\:\$p>svg>foreignObject>section:after{--marpit-root-font-size:20px}div#\:\$p>svg>foreignObject>section ul{margin-left:0px;padding-left:5}div#\:\$p>svg>foreignObject>section .profile-picture{position:relative;overflow:hidden;border-radius:50%;border-color:black;height:75px;width:75px;margin:10px;display:block}div#\:\$p>svg>foreignObject>section[data-marpit-scope-Swfldi8A] .columns{ + /* grid-template-columns: repeat(2, minmax(0, 1fr)); */grid-template-columns:500px 500px;gap:30px;display:flex;justify-content:center}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]{columns:initial!important;display:block!important;padding:0!important}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]:after,div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]:before,div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=content]:after,div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=content]:before{display:none!important}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]>div[data-marpit-advanced-background-container]{all:initial;display:flex;flex-direction:row;height:100%;overflow:hidden;width:100%}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]>div[data-marpit-advanced-background-container][data-marpit-advanced-background-direction=vertical]{flex-direction:column}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background][data-marpit-advanced-background-split]>div[data-marpit-advanced-background-container]{width:var(--marpit-advanced-background-split,50%)}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background][data-marpit-advanced-background-split=right]>div[data-marpit-advanced-background-container]{margin-left:calc(100% - 
var(--marpit-advanced-background-split, 50%))}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]>div[data-marpit-advanced-background-container]>figure{all:initial;background-position:center;background-repeat:no-repeat;background-size:cover;flex:auto;margin:0}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=background]>div[data-marpit-advanced-background-container]>figure>figcaption{position:absolute;border:0;clip:rect(0,0,0,0);height:1px;margin:-1px;overflow:hidden;padding:0;white-space:nowrap;width:1px}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=content],div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=pseudo]{background:transparent!important}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background=pseudo],div#\:\$p>svg[data-marpit-svg]>foreignObject[data-marpit-advanced-background=pseudo]{pointer-events:none!important}div#\:\$p>svg>foreignObject>section[data-marpit-advanced-background-split]{width:100%;height:100%}
+

Swate hands-on

-
-

Goals

+
+

Goals

  • Get familiar with ISA metadata and Swate
  • Annotate data in your ARC
-
-

Check Swate installation

-

☑️ Make sure Swate is installed:

+
+

Check Swate installation

+

☑️ Make sure Swate is installed:

  1. Open Excel (online or Desktop)
  2. Go to the Insert tab: Click the arrow next to "My Add-ins". There you should be able to select Swate.
  3. Go to the Data tab: you should see the Swate (Core) add-in.
-

💡 Alternatively, you can use Swate standalone
-(⚠️ this is however work in progress and likely to change)

+

💡 Alternatively, you can use Swate standalone
+(⚠️ this is however work in progress and likely to change)

-
-

Have a simple text editor ready

+
+

Have a simple text editor ready

  • Windows Notepad
  • MacOS TextEdit

Recommended text editor with code highlighting, git support, terminal, etc.: Visual Studio Code

-
-

Download the demo data

+
+

Download the demo data

  1. Open the ARCitect
  2. @@ -59,60 +59,56 @@

    Download the demo data

  3. Navigate to Download ARC (4)
-
-

Download the demo data

+
+

Download the demo data

  1. Search for Talinum-CAM-Photosynthesis
  2. Click the download button, select a location and open the ARC.

-

💡 This is basically the ARC we created last session.

+

💡 This is basically the ARC we created last session.

-
-

Where we left off last time

-

👩‍💻 Initiated an ARC
-📂 Structured and ...
-🌐 Shared with collaborators

+
+

Where we left off last time

+

👩‍💻 Initiated an ARC
+📂 Structured and ...
+🌐 Shared with collaborators


Today we want to

... annotate the experimental data

-
-

Swate hands-on with demo data

+
+

Swate hands-on with demo data

-
-

Swate Overview

+
+

Swate Overview

-
-

Let's annotate the plant samples first

+
+

Let's annotate the plant samples first

  1. Navigate to the demo ARC.
  2. Open the lab notes studies/talinum_drought/protocols/plant_material.txt in a text editor.
  3. Open the empty studies/talinum_drought/isa.study.xlsx workbook in Excel.
-
-

Create an annotation table

+
+

Create an annotation table


Create a Swate annotation table via the create annotation table button in the yellow pop-up box OR click the Create Annotation Table quick access button.


-
-

💡 Each table is by default created with one input (Source Name) and one output (Sample Name) column

-
-
-

💡 Only one annotation table can be added per Excel sheet

-
+

💡 Each table is by default created with one input (Source Name) and one output (Sample Name) column

+

💡 Only one annotation table can be added per Excel sheet

-
-

Add a building block

+
+

Add a building block

  1. Navigate to the Building Blocks tab via the navbar. Here you can add Building Blocks to the table.
  2. Instead of Parameter select Characteristic from the drop-down menu (A)
  3. @@ -120,89 +116,101 @@

    Add a building block

  4. Select the Term with the id OBI:0100026 and,
  5. Click Add building block.
-
-

💡 This adds three columns to your table, one visible and two hidden.

-
+

💡 This adds three columns to your table, one visible and two hidden.

-
-

Insert values to annotate your data

+
+

Insert values to annotate your data

  1. Navigate to the Terms tab in the Navbar
  2. In the annotation table, select any number of cells below Characteristic [organism]
  3. Click into the search field in Swate.
-
-

💡 You should see organism showing in a field in front of the search field
-💡 The search will now yield results related to organism

-
+

💡 You should see organism showing in a field in front of the search field
+💡 The search will now yield results related to organism

  1. In the search field, search for "Talinum fruticosum"
  2. Select the first hit and click Fill selected cells with this term
-
-

Add a building block with a unit

+
+

Add a building block with a unit

  1. In the Building Blocks tab, select Parameter, search for light intensity exposure and select the term with id PECO:0007224.
  2. Check the box for This Parameter has a unit and search for microeinstein per square meter per second in the adjacent search bar.
  3. Select UO:0000160.
  4. Click Add building block.
-
-

💡 This adds four columns to your table, one visible and three hidden.

-
+

💡 This adds four columns to your table, one visible and three hidden.
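To make the column counts concrete, here is a small sketch of which columns a building block might contribute to the table. The main column header follows the pattern shown on the slides (e.g. Parameter [light intensity exposure]); the names of the hidden columns (`Unit`, `Term Source REF (...)`, `Term Accession Number (...)`) are an assumption about Swate's layout at the time, and `building_block_columns` is a hypothetical helper, not a Swate function.

```python
from typing import List


def building_block_columns(
    block_type: str, term_name: str, term_id: str, has_unit: bool = False
) -> List[str]:
    """List the columns one building block adds: the visible one first, then hidden ones."""
    columns = [f"{block_type} [{term_name}]"]  # visible main column
    if has_unit:
        columns.append("Unit")  # assumed name of the hidden unit column
    # Assumed names of the two hidden ontology reference columns.
    columns.append(f"Term Source REF ({term_id})")
    columns.append(f"Term Accession Number ({term_id})")
    return columns


organism = building_block_columns("Characteristic", "organism", "OBI:0100026")
light = building_block_columns(
    "Parameter", "light intensity exposure", "PECO:0007224", has_unit=True
)

print(len(organism), organism[0])
print(len(light), light[0])
```

Without a unit the block contributes three columns, with a unit four, matching the counts stated on the slides.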

-
-

Insert unit-values to annotate your data

+
+

Insert unit-values to annotate your data

In the annotation table, select any cell below Parameter [light intensity exposure] and add "425" as light intensity.

-
-

💡 You can see the numbers being complemented with the chosen unit, e.g. 425.00 microeinstein per square meter per second

-
+

💡 You can see the numbers being complemented with the chosen unit, e.g. 425.00 microeinstein per square meter per second

-
-

Showing ontology reference columns

+
+

Showing ontology reference columns

Hold Ctrl and click the Autoformat Table quick access button to adjust column widths and un-hide all hidden columns.

-
-

💡 You can see that your organism of choice was added with id and source Ontology in the reference (hidden) columns.
-⚠️ This feature is currently not supported on MacOS

-
+

💡 You can see that your organism of choice was added with id and source Ontology in the reference (hidden) columns.

+

⚠️ This feature is currently not supported on MacOS

-
-

Update ontology reference columns

+
+

Update ontology reference columns

Click the Update Ontology Terms quick access button.

-
-

💡 This updates all reference columns according to the main column. In this case the reference columns for Parameter [light intensity exposure] are updated with the id and source ontology of the microeinstein per square meter per second unit.

-
+

💡 This updates all reference columns according to the main column. In this case the reference columns for Parameter [light intensity exposure] are updated with the id and source ontology of the microeinstein per square meter per second unit.

-
-

Your ISA table is growing

+
+

Your ISA table is growing

At this point, your table should look similar to this:

-
-

Hiding ontology reference columns

+
+

Hiding ontology reference columns

Click the Autoformat Table quick access button without holding Ctrl to hide all reference columns.

-
-

Exercise 📝

+
+

Exercise 📝

Try to add suitable building blocks for other pieces of metadata from the plant growth protocol (studies/talinum_drought/protocols/plant_material.txt).

-
-

Let's annotate the RNA Seq data

+
+

Add a factor building block

+
    +
  1. In the Building Blocks tab, select Factor, search for watering exposure and select the term with id PECO:0007383.
  2. +
  3. Click Add building block.
  4. +
  5. Add the drought treatment ("no water for 12 days", "re-water for 2 days") to the respective samples
  6. +
+

💡 There are different options to add the drought treatment.

+
+
+ +
    +
  1. In the Building Blocks tab, select Protocol Columns -> Protocol REF.
  2. +
  3. Click Add building block.
  4. +
  5. Add the name of the protocol file (plant_material.txt) to the Protocol REF column.
  6. +
+

💡 This allows you to reference the free-text, human-readable protocol.

+
+
+

Fill out source name and sample name

+

Transfer the sample ids from the protocol.

+
    +
  1. Invent names for Source Name (we do not have this information)
  2. +
  3. Use the sample names (DB_*) as Sample Name
  4. +
+
+
+

Let's annotate the RNA Seq data

  1. Navigate to the demo ARC.
  2. Open the lab notes assays/rnaseq/protocols/RNA_extraction.txt in a text editor.
  3. Open the empty assays/rnaseq/isa.assay.xlsx workbook in Excel.
-
-

Use a template

+
+

Use a template

  1. Navigate to Templates in the Navbar and click Browse database in the first function block.
-
-

💡 Here you can find community created workflow annotation templates

-
+

💡 Here you can find community created workflow annotation templates

  1. Search for RNA extraction and click select
      @@ -212,68 +220,126 @@

      Use a template

    • Click Add template to add all Building Blocks from the template that do not yet exist in your table.
-
-

Adding / Updating unit references

+
+

Adding / Updating unit references

Sometimes you need to add or update the unit of an existing building block.

  1. Select any number of rows of the Parameter [biosource amount] building block to mark it for the next steps.
  2. Open the Building Blocks tab
  3. In the bottom panel "Add/Update unit reference to existing building block", search for the unit "milligram". Select the unit term and click Update unit for cells.
    -💡 If you already had values in the main column they will be updated automatically.
  4. +💡 If you already had values in the main column they will be updated automatically.
  5. Click the Update Ontology Terms quick access button, to update the reference columns.
-
-

Remove building blocks

+
+

Remove building blocks

If any Building Blocks do not fit your experiment, you can use the Remove Building Block quick access button to remove them, including all related (hidden) reference columns.

-

⚠️ Because of the hidden reference columns, we recommend not deleting table columns via the usual Excel functions.

+

⚠️ Because of the hidden reference columns, we recommend not deleting table columns via the usual Excel functions.

-
-

New process, new worksheet

+
+

New process, new worksheet

  1. Add a new sheet to the assays/rnaseq/isa.assay.xlsx workbook.
  2. Add the template "RNASeq Assay"
-
-

Exercise 📝

+
+

Exercise 📝

Try to fill the two sheets with the protocol details:

  • assays/rnaseq/protocols/RNA_extraction.txt and
  • assays/rnaseq/protocols/Illumina_libraries.txt
-
-

Your ISA table is ready 🎉

+
+ +
    +
  1. Use the Sample Name of studies/talinum_drought/isa.study.xlsx as the Source Name in rna-extraction.
  2. +
  3. Use the Sample Name of rna-extraction as the Source Name in illumina-libraries.
  4. +
+
+
+ + +
+
+ flowchart LR + %% Nodes + S1(Seeds) + S2(Leaves) + M1(RNA) + P1>plant growth] + P2>RNA extraction] + P6>Illumina] + D2(fastq files) + %% Links + S1 ---P1--> S2 + S2 ---P2--> M1 + M1 ---P6--> D2 +
+
+
+
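The idea of chaining sheets, where the output name of one process becomes the input name of the next, can be sketched as follows. This is an illustration only: `Process` and `trace` are hypothetical helpers, and the sample identifiers are made up (the real ones come from the protocol files).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Process:
    """One annotation table: maps each input name to its output name."""
    name: str
    mapping: Dict[str, str]


def trace(start: str, processes: List[Process]) -> List[str]:
    """Follow one input through consecutive processes and return the full path."""
    path = [start]
    current = start
    for process in processes:
        if current not in process.mapping:
            raise KeyError(f"{current!r} is not an input of {process.name!r}")
        current = process.mapping[current]
        path.append(current)
    return path


# Hypothetical names; the output of each process is the input of the next.
workflow = [
    Process("plant growth", {"seed_1": "DB_001"}),
    Process("rna-extraction", {"DB_001": "DB_001_rna"}),
    Process("illumina-libraries", {"DB_001_rna": "DB_001.fastq.gz"}),
]

lineage = trace("seed_1", workflow)
print(" -> ".join(lineage))
```

If a name is misspelled in one sheet, the chain breaks at that process, which is why the names are best copied rather than retyped.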
+ +
    +
  1. In the Building Blocks tab, select Output Columns -> Raw Data File.
  2. +
  3. Click Add building block.
  4. +
+

💡 You see a warning about a changed output column.

+
    +
  1. Click Continue.
  2. +
  3. Go to the File Picker tab and click Pick file names.
  4. +
  5. Select and open the *fastq.gz files from the dataset folder.
  6. +
  7. Copy and paste the file names into the Raw Data File column.
  8. +
+

💡 This allows you to link your samples to the resulting raw data files.

+
+
+

Your ISA table is ready 🎉

Go ahead, adjust the Building Blocks you want to use to describe your experiment as you see fit.
Insert values using Swate Term search and add input and output.

-
-

A small detour on "Excel Tables"

+
+

Re-use a protocol (process.json)

+
    +
  1. Open the empty assays/metabolomics/isa.assay.xlsx workbook in Excel.
  2. +
  3. Navigate to Templates in the Navbar and scroll down to "Add template(s) from file."
  4. +
  5. Click Upload protocol
  6. +
  7. Select the file "swate_agilent_gc.json" from the demo data.
  8. +
  9. Click Insert json
  10. +
+ +

💡 This adds not just an empty template but a filled-out table with keys (headers) and values (cells).

+
+
+

A small detour on "Excel Tables"

Swate uses Excel's "table" feature to annotate workflows. Each table represents one process from input (e.g. plant leaf material) to output (e.g. leaf extract).

Example workflows with three processes each:

  • Plant growth → sampling → extraction
  • Measured data files → statistical analysis → result files
+

💡 Excel tables allow you to group data that belongs together inside one sheet. A table is not to be confused with a (work)sheet or a workbook.

-

💡 Excel tables allow you to group data that belongs together inside one sheet. A table is not to be confused with a (work)sheet or a workbook.

workbook              (e.g. "isa.assay.xlsx")
  └─── worksheet       (e.g. "plant_growth")
           └─── table  (e.g. "annotationTable")
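The nesting can also be read as a small data model. The sketch below is purely illustrative (the class names are hypothetical and no Excel file is touched); it encodes the hierarchy above together with the rule stated earlier that only one annotation table can be added per sheet.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AnnotationTable:
    name: str
    columns: List[str] = field(default_factory=list)


@dataclass
class Worksheet:
    name: str
    table: Optional[AnnotationTable] = None

    def add_table(self, table: AnnotationTable) -> None:
        """Attach the annotation table; a sheet holds at most one."""
        if self.table is not None:
            raise ValueError(f"sheet {self.name!r} already has an annotation table")
        self.table = table


@dataclass
class Workbook:
    filename: str
    sheets: Dict[str, Worksheet] = field(default_factory=dict)

    def add_sheet(self, name: str) -> Worksheet:
        """Create a new, empty worksheet and return it."""
        sheet = Worksheet(name)
        self.sheets[name] = sheet
        return sheet


workbook = Workbook("isa.assay.xlsx")
sheet = workbook.add_sheet("plant_growth")
sheet.add_table(AnnotationTable("annotationTable", ["Source Name", "Sample Name"]))

print(workbook.filename, "->", sheet.name, "->", sheet.table.name)
```

A second process therefore needs a second worksheet in the same workbook rather than a second table on the same sheet.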
 
-
-

🚧 Known issues with ARCitect and Swate (Nov 2023)

+
+

🚧 Known issues with ARCitect and Swate (Nov 2023)

  1. Annotation within ARCitect is not yet available.
  2. Swate and ARCitect handle isa.study.xlsx / isa.assay.xlsx files differently.
-
-
-

Contributors

+
+
+

Contributors

Slides presented here include contributions by

  • name: Dominik Brilhaus
    @@ -290,5 +356,5 @@

    Contributors

\ No newline at end of file + \ No newline at end of file diff --git a/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/README.html b/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/README.html index 509be8046..98b44e67d 100644 --- a/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/README.html +++ b/docs/teaching-materials/disseminations/2023-11-15_CEPLAS-ARC-Clubs/README.html @@ -66,7 +66,7 @@

README 2023-11-15_CEPLAS-ARC-Clubs

- last updated at 2023-11-15 + last updated at 2023-12-04 See website locally @@ -88,7 +88,7 @@

if grep -q "^marp: true" "$unit" then - marp --html $unit --allow-local-files --theme-set $marpTheme + npx @marp-team/marp-cli@latest --html $unit --allow-local-files --theme-set $marpTheme fi done diff --git a/docs/teaching-materials/style/dataplant_marp-theme.css b/docs/teaching-materials/style/dataplant_marp-theme.css index fd1f94378..c11085b81 100644 --- a/docs/teaching-materials/style/dataplant_marp-theme.css +++ b/docs/teaching-materials/style/dataplant_marp-theme.css @@ -52,4 +52,24 @@ section::after { font-family: Calibri, sans-serif; font-size: 20px; color: #2D3E50 +} + +kbd { + padding:0.1em 0.6em; + border:1px solid #ccc; + font-size:0.75em; + font-family:Arial,Helvetica,sans-serif; + background-color:#f7f7f7; + color:#333; + -moz-box-shadow:0 1px 0px rgba(0, 0, 0, 0.2),0 0 0 2px #ffffff inset; + -webkit-box-shadow:0 1px 0px rgba(0, 0, 0, 0.2),0 0 0 2px #ffffff inset; + box-shadow:0 1px 0px rgba(0, 0, 0, 0.2),0 0 0 2px #ffffff inset; + -moz-border-radius:3px; + -webkit-border-radius:3px; + border-radius:3px; + display:inline-block; + margin:0 0.1em; + text-shadow:0 1px 0 #fff; + line-height:1.4; + white-space:nowrap; } \ No newline at end of file diff --git a/pagefind/fragment/en-us_486244e.pf_fragment b/pagefind/fragment/en-us_486244e.pf_fragment deleted file mode 100644 index d10de5759..000000000 Binary files a/pagefind/fragment/en-us_486244e.pf_fragment and /dev/null differ diff --git a/pagefind/fragment/en-us_52e04ba.pf_fragment b/pagefind/fragment/en-us_52e04ba.pf_fragment new file mode 100644 index 000000000..651122a8d Binary files /dev/null and b/pagefind/fragment/en-us_52e04ba.pf_fragment differ diff --git a/pagefind/fragment/en-us_980cd2c.pf_fragment b/pagefind/fragment/en-us_980cd2c.pf_fragment deleted file mode 100644 index 61097ce2f..000000000 Binary files a/pagefind/fragment/en-us_980cd2c.pf_fragment and /dev/null differ diff --git a/pagefind/fragment/en-us_b3d4d61.pf_fragment b/pagefind/fragment/en-us_b3d4d61.pf_fragment new file mode 
100644 index 000000000..06133e0db Binary files /dev/null and b/pagefind/fragment/en-us_b3d4d61.pf_fragment differ diff --git a/pagefind/fragment/en-us_cdfa62c.pf_fragment b/pagefind/fragment/en-us_cdfa62c.pf_fragment deleted file mode 100644 index f6d2e24f2..000000000 Binary files a/pagefind/fragment/en-us_cdfa62c.pf_fragment and /dev/null differ diff --git a/pagefind/index/en-us_196e9ae.pf_index b/pagefind/index/en-us_196e9ae.pf_index new file mode 100644 index 000000000..5065ac085 Binary files /dev/null and b/pagefind/index/en-us_196e9ae.pf_index differ diff --git a/pagefind/index/en-us_2e57729.pf_index b/pagefind/index/en-us_2e57729.pf_index new file mode 100644 index 000000000..2d9fcbc5c Binary files /dev/null and b/pagefind/index/en-us_2e57729.pf_index differ diff --git a/pagefind/index/en-us_374197c.pf_index b/pagefind/index/en-us_374197c.pf_index new file mode 100644 index 000000000..a69e58a06 Binary files /dev/null and b/pagefind/index/en-us_374197c.pf_index differ diff --git a/pagefind/index/en-us_3fcd869.pf_index b/pagefind/index/en-us_3fcd869.pf_index deleted file mode 100644 index ac6693f00..000000000 Binary files a/pagefind/index/en-us_3fcd869.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_4325603.pf_index b/pagefind/index/en-us_4325603.pf_index deleted file mode 100644 index a2cafbed8..000000000 Binary files a/pagefind/index/en-us_4325603.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_5f3867b.pf_index b/pagefind/index/en-us_5f3867b.pf_index deleted file mode 100644 index f541a27b5..000000000 Binary files a/pagefind/index/en-us_5f3867b.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_5fd03a1.pf_index b/pagefind/index/en-us_5fd03a1.pf_index new file mode 100644 index 000000000..261e51b07 Binary files /dev/null and b/pagefind/index/en-us_5fd03a1.pf_index differ diff --git a/pagefind/index/en-us_6482a76.pf_index b/pagefind/index/en-us_6482a76.pf_index new file mode 100644 index 000000000..e36bb0511 Binary 
files /dev/null and b/pagefind/index/en-us_6482a76.pf_index differ diff --git a/pagefind/index/en-us_81a1b64.pf_index b/pagefind/index/en-us_81a1b64.pf_index new file mode 100644 index 000000000..174012ad4 Binary files /dev/null and b/pagefind/index/en-us_81a1b64.pf_index differ diff --git a/pagefind/index/en-us_83fd62e.pf_index b/pagefind/index/en-us_83fd62e.pf_index deleted file mode 100644 index d0e180849..000000000 Binary files a/pagefind/index/en-us_83fd62e.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_881d8f5.pf_index b/pagefind/index/en-us_881d8f5.pf_index new file mode 100644 index 000000000..bb0237723 Binary files /dev/null and b/pagefind/index/en-us_881d8f5.pf_index differ diff --git a/pagefind/index/en-us_b11afdd.pf_index b/pagefind/index/en-us_b11afdd.pf_index deleted file mode 100644 index 0cb9944da..000000000 Binary files a/pagefind/index/en-us_b11afdd.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_b4b5b89.pf_index b/pagefind/index/en-us_b4b5b89.pf_index deleted file mode 100644 index 7ba1ebd80..000000000 Binary files a/pagefind/index/en-us_b4b5b89.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_b4bdcbc.pf_index b/pagefind/index/en-us_b4bdcbc.pf_index new file mode 100644 index 000000000..b2156ebf3 Binary files /dev/null and b/pagefind/index/en-us_b4bdcbc.pf_index differ diff --git a/pagefind/index/en-us_b57bf1e.pf_index b/pagefind/index/en-us_b57bf1e.pf_index deleted file mode 100644 index 50c7b4e77..000000000 Binary files a/pagefind/index/en-us_b57bf1e.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_b58075f.pf_index b/pagefind/index/en-us_b58075f.pf_index deleted file mode 100644 index d4c50397c..000000000 Binary files a/pagefind/index/en-us_b58075f.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_b79f7f0.pf_index b/pagefind/index/en-us_b79f7f0.pf_index new file mode 100644 index 000000000..416634c08 Binary files /dev/null and b/pagefind/index/en-us_b79f7f0.pf_index 
differ diff --git a/pagefind/index/en-us_cccf20c.pf_index b/pagefind/index/en-us_cccf20c.pf_index deleted file mode 100644 index a48e94683..000000000 Binary files a/pagefind/index/en-us_cccf20c.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_d11438f.pf_index b/pagefind/index/en-us_d11438f.pf_index deleted file mode 100644 index 29db298bd..000000000 Binary files a/pagefind/index/en-us_d11438f.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_dc39931.pf_index b/pagefind/index/en-us_dc39931.pf_index deleted file mode 100644 index 1f8adda2b..000000000 Binary files a/pagefind/index/en-us_dc39931.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_df38bd4.pf_index b/pagefind/index/en-us_df38bd4.pf_index new file mode 100644 index 000000000..535cd0a58 Binary files /dev/null and b/pagefind/index/en-us_df38bd4.pf_index differ diff --git a/pagefind/index/en-us_e584324.pf_index b/pagefind/index/en-us_e584324.pf_index deleted file mode 100644 index 7f61ae081..000000000 Binary files a/pagefind/index/en-us_e584324.pf_index and /dev/null differ diff --git a/pagefind/index/en-us_eae4bdd.pf_index b/pagefind/index/en-us_eae4bdd.pf_index new file mode 100644 index 000000000..8866b7a7b Binary files /dev/null and b/pagefind/index/en-us_eae4bdd.pf_index differ diff --git a/pagefind/index/en-us_eb6f88e.pf_index b/pagefind/index/en-us_eb6f88e.pf_index new file mode 100644 index 000000000..ac5309956 Binary files /dev/null and b/pagefind/index/en-us_eb6f88e.pf_index differ diff --git a/pagefind/index/en-us_f4eb644.pf_index b/pagefind/index/en-us_f4eb644.pf_index new file mode 100644 index 000000000..8dde2c899 Binary files /dev/null and b/pagefind/index/en-us_f4eb644.pf_index differ diff --git a/pagefind/index/en-us_fa9d237.pf_index b/pagefind/index/en-us_fa9d237.pf_index deleted file mode 100644 index ee9dbd372..000000000 Binary files a/pagefind/index/en-us_fa9d237.pf_index and /dev/null differ diff --git a/pagefind/pagefind-entry.json 
b/pagefind/pagefind-entry.json index 2c57c4b99..b17bca852 100644 --- a/pagefind/pagefind-entry.json +++ b/pagefind/pagefind-entry.json @@ -1 +1 @@ -{"version":"1.0.4","languages":{"de-de":{"hash":"de-de_8e81e7fc77bd9","wasm":"de-de","page_count":1},"en-us":{"hash":"en-us_58809972e0fce","wasm":"en-us","page_count":422}}} \ No newline at end of file +{"version":"1.0.4","languages":{"de-de":{"hash":"de-de_8e81e7fc77bd9","wasm":"de-de","page_count":1},"en-us":{"hash":"en-us_5a474fe33056e","wasm":"en-us","page_count":421}}} \ No newline at end of file diff --git a/pagefind/pagefind.en-us_58809972e0fce.pf_meta b/pagefind/pagefind.en-us_58809972e0fce.pf_meta deleted file mode 100644 index 9381944c5..000000000 Binary files a/pagefind/pagefind.en-us_58809972e0fce.pf_meta and /dev/null differ diff --git a/pagefind/pagefind.en-us_5a474fe33056e.pf_meta b/pagefind/pagefind.en-us_5a474fe33056e.pf_meta new file mode 100644 index 000000000..007cac45f Binary files /dev/null and b/pagefind/pagefind.en-us_5a474fe33056e.pf_meta differ