Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DPL-823 [Part 1 - download] Import PBMC pool plates into Sequencescape #4017

Merged
merged 18 commits into from
Mar 11, 2024

Conversation

KatyTaylor
Copy link
Contributor

@KatyTaylor KatyTaylor commented Feb 7, 2024

See also - changes in Limber - sanger/limber#1576

Changes proposed in this pull request

  • Add new manifest for scRNA Core Pools Plate, and the download functionality for it - adding the feature to set the number of rows per well in the spreadsheet.

  • This feature is turned off until the upload functionality is implemented.

  • Includes refactor of the Plate Behaviour class.

Instructions for Reviewers

[All PRs] - Confirm PR template filled
[Feature Branches] - Review code
[Production Merges to main]
    - Check story numbers included
    - Check for debug code
    - Check version

- Plus some formatting / typo fixes
- rows_per_well
- for uploading pool plates, where there are multiple samples per well and we want to provide some sample-level data
- Specify it in the manifest type config
- Add it as a new column on manifest db table and model (doesn't necessarily need to be persisted, but that seems to be how the code works, that it accesses attributes saved against the manifest)
- When generating  in  in , add multiple entries per well, according to value of rows_per_well attribute
- Well_data eventually gets exported as @details_array, which is used to generate the downloaded spreadsheet
- Modify multiple code references to cope with possibility of multiple rows per well in well_data
…nifest

- Add test for rows_per_well in SampleManifest::Generator
- Checks details_array is of the correct length - this is what goes on to generate the rows in the spreadsheet
…3-Import-PBMC-pool-plates-into-Sequencescape
- it should be defined in Sequencescape, not just Limber, as it is needed to create a sample manifest using that purpose
- 'feature flag' the rows_per_well feature off, by commenting, because it shouldn't be activated until the manifests can be successfully uploaded
- skip the relevant tests until the feature is activated
- remove the in progress test in the feature test, because I think the generator tests are sufficient
- update a couple of feature tests that need to have their count incremented, due to adding an extra test manifest type
app/models/sample_manifest.rb Outdated Show resolved Hide resolved
well_data = map_ids_to_sample_ids.map { |map_id, sample_id| [maps[map_id], sample_id] }
well_data = {}
map_ids_to_sample_ids.each do |map_id, sample_id|
if (well_data[maps[map_id]])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform calls 'maps[map_id]' 3 times

well_data = map_ids_to_sample_ids.map { |map_id, sample_id| [maps[map_id], sample_id] }
well_data = {}
map_ids_to_sample_ids.each do |map_id, sample_id|
if (well_data[maps[map_id]])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform calls 'well_data[maps[map_id]]' 2 times

@@ -126,13 +129,15 @@ def generate_plates(purpose) # rubocop:todo Metrics/AbcSize
plates = Array.new(count) { purpose.create!(:without_wells) }.sort_by(&:human_barcode)

plates.each do |plate|
sanger_sample_ids = generate_sanger_ids(plate.size)
sanger_sample_ids = generate_sanger_ids(plate.size * @manifest.rows_per_well)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#generate_plates calls '@manifest.rows_per_well' 2 times

@@ -49,6 +49,7 @@
has_uploaded_document :generated, differentiator: 'generated'

attr_accessor :override, :only_first_label
attr_writer :rows_per_well
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest#rows_per_well is a writable attribute

- collected by (collection site) is not required (confirmed by R&D)
- supplier name should be included as this is the external id for the sample - will be the original vacutainer barcode - donor id does not uniquely identify the sample
@KatyTaylor KatyTaylor marked this pull request as ready for review March 6, 2024 15:16
@KatyTaylor KatyTaylor self-assigned this Mar 6, 2024
@KatyTaylor KatyTaylor changed the title DPL-823 Import PBMC pool plates into Sequencescape DPL-823 [Part 1 - download] Import PBMC pool plates into Sequencescape Mar 6, 2024
well_data = map_ids_to_sample_ids.map { |map_id, sample_id| [maps[map_id], sample_id] }
well_data = {}
map_ids_to_sample_ids.each do |map_id, sample_id|
if well_data[maps[map_id]]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform calls 'maps[map_id]' 3 times

well_data = map_ids_to_sample_ids.map { |map_id, sample_id| [maps[map_id], sample_id] }
well_data = {}
map_ids_to_sample_ids.each do |map_id, sample_id|
if well_data[maps[map_id]]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform calls 'well_data[maps[map_id]]' 2 times

- hopefully the separation of code into methods makes more sense now
- there were lots of methods named very similarly
- got rid of block passing / yield pattern which was hard to read
- change the structure of well_data to make more sense when there are multiple rows_per_well
- moved code into GenerateWellsJob to avoid the back and forth

sample_manifest.core_behaviour.generate_wells_job(well_data, plate)
map_ids_to_sanger_sample_ids.each do |map_id, sanger_sample_ids|
plate.wells
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform calls 'plate.wells' 2 times

map_ids_to_sanger_sample_ids.each do |map_id, sanger_sample_ids|
plate.wells
.create!(map: maps[map_id]) do |well|
sanger_sample_ids.each do |sanger_sample_id|
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform contains iterators nested 3 deep

# plate_id => { map_id => [sanger_sample_id, sanger_sample_id, ...] }
def build_well_data(sanger_sample_ids)
@plates.each_with_object({}) do |plate, well_data|
well_data[plate.id] = {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#build_well_data calls 'plate.id' 2 times

# for the delayed job will not be large enough and will truncate the data.
def build_wells_async(well_data)
@plates.each do |plate|
Delayed::Job.enqueue SampleManifest::GenerateWellsJob.new(@manifest.id, well_data[plate.id], plate.id)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#build_wells_async calls 'plate.id' 2 times

# [{barcode, position, sanger_sample_id}, {barcode, position, sanger_sample_id}, ...]
def build_details_array(well_data)
@details_array = @plates.flat_map do |plate|
well_data[plate.id].flat_map do |map_id, sanger_sample_ids|
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#build_details_array refers to 'plate' more than self (maybe move it to another class?)


sample_manifest.core_behaviour.generate_wells_job(well_data, plate)
map_ids_to_sanger_sample_ids.each do |map_id, sanger_sample_ids|
plate
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform calls 'plate; .wells' 2 times

def build_details_array(well_data)
@details_array =
@plates.flat_map do |plate|
well_data[plate.id].flat_map do |map_id, sanger_sample_ids|
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#build_details_array refers to 'plate' more than self (maybe move it to another class?)

# plate_id => { map_id => [sanger_sample_id, sanger_sample_id, ...] }
def build_well_data(sanger_sample_ids)
@plates.each_with_object({}) do |plate, well_data|
well_data[plate.id] = {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#build_well_data refers to 'plate' more than self (maybe move it to another class?)

@details_array =
@plates.flat_map do |plate|
well_data[plate.id].flat_map do |map_id, sanger_sample_ids|
sanger_sample_ids.map do |sanger_sample_id|
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SampleManifest::PlateBehaviour::Base#build_details_array contains iterators nested 3 deep

app/jobs/sample_manifest/generate_wells_job.rb Outdated Show resolved Hide resolved
# Ensure the order of the wells are maintained
maps = Map.find(map_ids).index_by(&:id)
well_data = map_ids_to_sample_ids.map { |map_id, sample_id| [maps[map_id], sample_id] }
map_ids_to_sanger_sample_ids.each { |k, v| create_well(k, v) }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateWellsJob#perform has the variable name 'v'

Copy link
Contributor

@sdjmchattie sdjmchattie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a few minor issues here. Also like was mentioned before, do we have any value in the unit tests that are being skipped via the xit command? If not, we should just remove them.

app/models/sample_manifest.rb Outdated Show resolved Hide resolved
app/models/sample_manifest/plate_behaviour.rb Show resolved Hide resolved
spec/models/sample_manifest/generator_spec.rb Outdated Show resolved Hide resolved
spec/models/sample_manifest/generator_spec.rb Outdated Show resolved Hide resolved
# Use a default value of 1 for rows_per_well if not set
def rows_per_well
1
# TODO: replace above line with below line to turn the rows_per_well feature on, when DPL-823 is complete
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO found

@KatyTaylor
Copy link
Contributor Author

I think there are a few minor issues here. Also like was mentioned before, do we have any value in the unit tests that are being skipped via the xit command? If not, we should just remove them.

Forgot to reply to this. Yes, they test the functionality being added here that is 'flagged off' by the commented out line. When I turn the feature on, I'll also turn the tests on.

Copy link

codeclimate bot commented Mar 11, 2024

Code Climate has analyzed commit b56a3af and detected 13 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 10
Duplication 2
Bug Risk 1

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 86.6% (0.0% change).

View more on Code Climate.

@@ -4,7 +4,7 @@
Struct.new(:sample_manifest_id, :map_ids_to_sanger_sample_ids, :plate_id) do
def perform
ActiveRecord::Base.transaction do
map_ids_to_sanger_sample_ids.each { |k, v| create_well(k, v) }
map_ids_to_sanger_sample_ids.each { |map_id, sanger_sample_id| create_well(map_id, sanger_sample_id) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the work around for this variable renaming pain is to use that Ruby syntax where you define a method to pass the values to instead of write it out in full like you have here. I don't know if this works when you have two values coming in, but I know you can push the values through another method when you write it something like:

Suggested change
map_ids_to_sanger_sample_ids.each { |map_id, sanger_sample_id| create_well(map_id, sanger_sample_id) }
map_ids_to_sanger_sample_ids.each &:create_well

Copy link
Contributor

@sdjmchattie sdjmchattie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New changes look good. Not sure if my suggestion about how to use each works or is helpful?

@KatyTaylor KatyTaylor merged commit a5a03c2 into develop Mar 11, 2024
21 checks passed
@KatyTaylor KatyTaylor deleted the DPL-823-Import-PBMC-pool-plates-into-Sequencescape branch March 11, 2024 14:40
@KatyTaylor KatyTaylor linked an issue May 2, 2024 that may be closed by this pull request
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DPL-823 Import PBMC pool plates into Sequencescape
2 participants