-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of the canopy model #288
base: develop
Are you sure you want to change the base?
Conversation
…py model implementation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #288 +/- ##
===========================================
+ Coverage 95.29% 95.35% +0.06%
===========================================
Files 28 34 +6
Lines 1720 2176 +456
===========================================
+ Hits 1639 2075 +436
- Misses 81 101 +20 ☔ View full report in Codecov by Sentry. |
for layer in np.arange(self.n_layers - 1): | ||
target_area = (layer + 1) * community.cell_area * (1 - canopy_gap_fraction) | ||
|
||
# TODO - the solution here is predictably closer to the upper bracket, might |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of putting a TODO in the code, creating an issue has a higher likelihood of this being seen and solved.
validate: Boolean flag to suppress argument validation. | ||
""" | ||
|
||
# TODO - could this merge with the stem crown area function? A lot of overlap, so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, should be an issue
# def calculate_total_canopy_A_cp(z: float, f_g: float, community: Community) -> float: | ||
# """Calculate total leaf area at a given height. | ||
|
||
# :param f_g: | ||
# :param community: | ||
# :param z: Height above ground. | ||
# :return: Total leaf area in the canopy at a given height. | ||
# """ | ||
# A_cp_for_individuals = calculate_projected_leaf_area_for_individuals( | ||
# z, f_g, community | ||
# ) | ||
|
||
# A_cp_for_cohorts = A_cp_for_individuals * community.cohort_number_of_individuals | ||
|
||
# return A_cp_for_cohorts.sum() | ||
|
||
|
||
# def calculate_gpp(cell_ppfd: NDArray, lue: NDArray) -> float: | ||
# """Estimate the gross primary productivity. | ||
|
||
# Not sure where to place this - need an array of LUE that matches to the | ||
|
||
# """ | ||
|
||
# return 100 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're placeholders for the next steps in this milestone, so easier for me to leave them there for now.
|
||
|
||
def test_calculate_heights(): | ||
"""Tests happy path for calculation of heights of tree from diameter.""" | ||
|
||
from pyrealm.demography.t_model_functions import calculate_heights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I don't know how I feel about moving imports into functions. I prefer them to be on the top of the file, for better overview. Is there a difference in performance if you draw the imports in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(same below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That style was advice from the RSE team at Imperial. The argument is that having the imports for the package being tested inside the functions isolates the actual importatation as part of the test. So, if stuff goes wrong with an import, a specific related test fails, rather than a whole module of tests (or the whole test suite).
I definitely see what you mean about having an overview of which code is getting tested but I buy their argument and we've seen it be useful in practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is a well-structured and thoroughly documented piece of scientific code. The use of numpy, type hinting, and input validation contributes to its robustness. My suggestions are for further improvement of the code, but not critical.
self, | ||
community: Community, | ||
canopy_gap_fraction: float, | ||
layer_tolerance: float = 0.001, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps worth outlining in the docstring this default value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've actually got an issue about this (#270) - @j-emberton pointed out that the documentation of defaults and optional values isn't consistent throughout the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that this stops me fixing this specific instance 😄
stem_height=community.cohort_data["stem_height"].to_numpy(), | ||
m=community.cohort_data["m"].to_numpy(), | ||
n=community.cohort_data["n"].to_numpy(), | ||
validate=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate
is set to False
in these function calls. Should there be an option to set this to true via this class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should doc this better. The validation is intended to help people use the standalone functions correctly, but here the Canopy.__init__
should guarantee the correct inputs, and so the validation is turned off to improve run time by avoiding repeated validation within the root solver call.
If it's broken, that's a developer issue with the __init__
code, rather than something where a user could usefully turn validation back on.
self.stem_relative_radius: NDArray[np.float32] = ( | ||
calculate_relative_canopy_radius_at_z( | ||
z=self.layer_heights, | ||
stem_height=community.cohort_data["stem_height"].to_numpy(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same arrays are being converted to numpy in different calls. From performance point of view it might worth doing a conversion once and using those arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is the issue we discussed yesterday about using pandas.DataFrame
for the cohort_data
attribute of Community
. I think the answer is to change that attribute - we're never making use of the pandas functionality outside of creating the cohort_data
, so it would be much cleaner as a dictionary of np.array
s.
|
||
|
||
def solve_community_projected_canopy_area( | ||
z: float, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes a really long argument list can be a bit cumbersome so worth considering dataclass or namedtuple for such cases. However, that comes with different overheads so a bit of a style choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specific module has a bunch of extremely similar but not identical signatures. We could bundle everything up in one dataclass (and indeed @AmyOctoCat had these using Community
for exactly that reason), but I'd like to retain the flexibility of using named arguments directly.
One reason for that is that I want to build some user facing helper classes to generate canopy data outside of a Community object, so it's probably easier to keep them like this, but I will be looking at that next.
raise ValueError("Invalid shape for the z value.") | ||
|
||
|
||
def calculate_relative_canopy_radius_at_z( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the inputs array can be reshaped to be broadcastable, then using numpy functions instead of python built-ins e.g. np.power
instead of **
can help with vectorization and improve performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to think about (and document) what shapes are sensible here. I believe that M**n
is faster if n
is a scalar but np.power(M, n)
is faster when n
is an array? Could get either in using this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think np.power will be faster if you are applying the operation to an array even if the exponent is a scaler. And you can further performance gains if you can apply np.vectorize to functions. Though I feel like I need to back up this with some testing!
I should add that I am happy to defer to James and Marion in terms of what they consider to important changes to make as they have better code and project knowledge than me. |
Description
This PR is to review, adapt and merge in @AmyOctoCat's code on the canopy object from #231. The main areas are:
The
t_model_functions
andcanopy_functions
modules have now settled down to which functions belong where.The
canopy_functions
module has now been extended to include the canopy vertical structure functions from WIP - 230 create a draft canopy class #231, including the relative canopy radius, stem crown area and stem leaf area at given heights, along with a solver function to find the height at which a canopy layer closes. This includes:Community
object as an argument and replacing it with explicit arguments for the community properties for that function. This makes the core functions more flexible.The
Canopy
object has been reworked to use this new functionality to get to the same end point as in WIP - 230 create a draft canopy class #231.I've also added unit tests for the canopy functions and a really simple test of the Canopy object.
Things to do later:
canopy.md
document to use this code.pandas
?Canopy
object to use it to generate absorbed radiation per stem per layer and hence generate productivity estimates.Fixes #286 (issue)
Type of change
Key checklist
pre-commit
checks:$ pre-commit run -a
$ poetry run pytest
Further checks