BrAPI v2 :: Errors in brapi/v2/search/germplasm #48

Open
mcrimi opened this issue Apr 29, 2022 · 3 comments

@mcrimi

mcrimi commented Apr 29, 2022

URL: http://gigwa.bms-uat-test.net:8080/gigwa/rest/brapi/v2/search/studies
Database: Sorghum

Log in with default admin credentials

POST http://gigwa.bms-uat-test.net:8080/gigwa/rest/brapi/v2/search/germplasm

Problem 1 (in my opinion 😊)

Given that the BMS doesn't know a priori the internal germplasmDbId of the individuals in Gigwa, I thought I would try searching by name to see whether it would help solve #46

Request body

{
  "germplasmNames": ["10_00984"],
  "variantSetDbIds": ["sorghum§1§1470.002"]
}

Response

{
    "metadata": {
        "status": [
            {
                "message": "Either a studyDbId or a list of germplasmDbIds must be specified as parameter!"
            }
        ]
    }
}

This doesn't make much sense to me: if I knew the germplasmDbId of my individual, I wouldn't be using the search in the first place. Food for thought.

Problem 2

But OK, let's see what happens if I search by germplasmDbId

Request body

{
  "germplasmDbIds": ["10_00984"],
  "variantSetDbIds": ["sorghum§1§1470.002"]
}

Response

500 Internal Server Error

cc: @GuilhemSempere @aliceboizet

@mcrimi mcrimi changed the title BrAPI 2.0 :: Errors in brapi/v2/search/germplasm BrAPI v2 :: Errors in brapi/v2/search/germplasm Apr 29, 2022
@GuilhemSempere
Collaborator

Problem 1: Filtering just by name is something really difficult to support with the current specs. The composite IDs used by the API start with a prefix that points to a Gigwa database, and since names do not contain such prefixes, a search based on just names would require scanning all databases connected to the Gigwa instance, which is something we obviously want to avoid (especially problematic with variants that can be highly numerous).
My opinion is that having dynamic BrAPI endpoint base-URLs into which we could "insert" a database name would clearly simplify everything because then we could get rid of (or at least simplify) those composite IDs by limiting searches to a single database. I thought BMS was facing similar issues, is that right?
Anyway, for the time being maybe we could consider making /search/germplasm able to search by name, but only if some studyDbIds are provided (be careful: your example passes variantSetDbIds, but I don't think this call supports that parameter). Would that be helpful?

Problem 2: This stems directly from the fact that you're passing a name instead of an ID (i.e., the database prefix is missing).
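
To illustrate the missing-prefix point, here is a minimal sketch of a prefix check. It assumes composite IDs use `§` as the separator, as in the IDs quoted in this issue, and that the first segment names the Gigwa database. The function `database_prefix` and the `known` set are hypothetical and are not taken from Gigwa's code.

```python
SEPARATOR = "§"  # separator seen in the IDs quoted in this issue


def database_prefix(db_id, known_databases):
    """Return the database prefix of a composite ID, or None if it lacks one."""
    head, sep, _rest = db_id.partition(SEPARATOR)
    # A bare name such as "10_00984" contains no separator, so sep is empty.
    if sep and head in known_databases:
        return head
    return None


known = {"sorghum"}  # hypothetical set of databases on the instance
print(database_prefix("sorghum§1§10_00983", known))
print(database_prefix("10_00984", known))
```

A server could run a check of this kind up front and answer with a client error rather than a 500 when the prefix is absent.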

@mcrimi
Author

mcrimi commented May 3, 2022

Problem 1: I don't think that'll work for my use case. From the BMS I want to retrieve the list of studies where a given germplasm or set of germplasm (let's assume that I already have the proper germplasmDbId from Gigwa) is present as an individual. The problem is that I don't know the studyDbIds a priori; that's exactly what I'm trying to find out with the search, see?

Btw, I will always be searching within a single programDbId (a database in Gigwa's nomenclature, I think). I don't know whether that would help with performance. My GET would be:

GET http://gigwa.bms-uat-test.net:8080/gigwa/rest/brapi/v2/studies: 
  "Request Body": {
    "germplasmDbId": "sorghum§1§10_00983",
    "programDbId": "sorghum"
  }

Problem 2: Indeed, I was lacking the prefix. Duh! Thanks! 😊

@GuilhemSempere
Collaborator

I'm getting confused because at some points in this issue you mention searching studies, at others you mention searching germplasm...

Anyway, I just saw that from v2.1 both BrAPI calls /search/studies and /search/germplasm support a programDbIds parameter. So we should be able to require it whenever a "name"-type parameter is used, which would indeed avoid scanning all databases.
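
The rule proposed above could be sketched as a request-body check like the one below. This is only an illustration under stated assumptions: `validate_search`, the `NAME_PARAMS` tuple, and the error text are hypothetical, and the list of name-type parameters is not exhaustive.

```python
NAME_PARAMS = ("germplasmNames",)  # name-type filters; illustrative, not exhaustive


def validate_search(body):
    """Return an error message if a name filter is used without programDbIds, else None."""
    uses_names = any(body.get(param) for param in NAME_PARAMS)
    if uses_names and not body.get("programDbIds"):
        # Hypothetical message; the real wording would be up to the implementation.
        return "programDbIds is required when filtering by name"
    return None


print(validate_search({"germplasmNames": ["10_00984"]}))
print(validate_search({"germplasmNames": ["10_00984"], "programDbIds": ["sorghum"]}))
```

With this check, a name-only request is rejected early, while a request that names a program can be routed to a single database.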

However, this raises an important question: when passing multiple parameters to a BrAPI search call, should they be combined with an AND or an OR? In the present case we need AND to achieve what we want, but until now I always thought OR made more sense (get me the callsets related to those germplasm PLUS those involved in this study). I think we can't let each implementation decide how to behave, otherwise a given client would get inconsistent results when switching from one data source to another. We probably need an extra parameter to say which operator to use... I don't know if this has been discussed before.
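
To make the difference concrete, here is a small in-memory sketch of combining list-valued filters with either operator. The `operator` argument is hypothetical (no such parameter is claimed to exist in BrAPI), and the records and IDs are made up for illustration.

```python
def matches(record, filters, operator="AND"):
    """Check a record against several list-valued filters combined with AND or OR."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # One boolean per non-empty filter: is the record's value in the accepted list?
    checks = [record.get(field) in accepted
              for field, accepted in filters.items() if accepted]
    if not checks:
        return True  # no active filter: everything matches
    return all(checks) if operator == "AND" else any(checks)


records = [
    {"germplasmDbId": "g1", "studyDbId": "s1"},
    {"germplasmDbId": "g1", "studyDbId": "s2"},
    {"germplasmDbId": "g2", "studyDbId": "s1"},
]
filters = {"germplasmDbId": ["g1"], "studyDbId": ["s1"]}

and_hits = [r for r in records if matches(r, filters, "AND")]
or_hits = [r for r in records if matches(r, filters, "OR")]
print(len(and_hits), len(or_hits))  # 1 3
```

The same request body yields one record under AND and three under OR, which is the inconsistency a client would see across implementations that choose differently.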
