Rapid diversification associated with ecological specialization in Neotropical Adelpha butterflies

Ebel et al. 2015 Rapid diversification associated with ecological specialization in Neotropical Adelpha butterflies Molecular Ecology 24: 2392-2405.


Figure 1 from Ebel et al. “Adelpha wing pattern and species diversity. (a) Examples of the nine Adelpha mimicry types. The number above each image indicates the number of species and subspecies with the pattern. From top left: A. iphiclus iphiclus, A. naxia naxia, A. thesprotia, A. cocala cocala, A. salmoneus colada, A. boreas boreas, A. justina justina, A. zina zina, A. levona, A. rothschildi, A. epione agilla, A. lycorias wallisii, A. ethelda ethelda, A. leuceria juanna, A. gelania gelania, A. seriphia barcanti, A. mesentina mesentina, A. melona deborah. (b) Five species have a unique wing pattern. From left: A. seriphia egregia, A. demialba demialba, A. justina inesae, A. zina pyrczi, A. lycorias lara. (c) Adelpha species richness across the Neotropical region (modified with permission from Mullen et al. 2011).”

Lynsey McInnes

Lynsey Bunnefeld

This was a funny choice from Will as it seems much more up my street than his. Indeed, my colleague James Nicholls and co. are developing similar phylogenomic methodologies to look at rapid diversification within Inga. James uses a targeted sequence approach that seems to also have worked pretty well. But I am too lazy to make this a post about the pros and cons of different genomic techniques.

In fact I’m not sure what to make this post about. I’m not sure what I think about this paper. On the one hand, it clearly represents an amazing amount of work – processing the samples, doing the bioinformatics and the bazillion versions of phylogenetic reconstruction and the assessment of missing data effects. I could not find any details in the main text, but they also appear to have dated the tree (and apparently better than previous attempts). And then they find neat relationships between toxicity of a common host plant family and convergent mimicry patterns across a variety of species. It’s a really nice story.

On the other hand, I am still not 100% convinced by the robustness of the various available methods of character state reconstruction (not discussed in the text) or of diversification rate shift detection (discussed a little bit). No doubt about it, a better phylogenetic hypothesis helps make these tests more meaningful, but they still rely heavily on accurate dating (at least relatively) and on some degree of rate conservatism across the tree(s) – or else you might infer different rates on every branch.

Grumble grumble. I am consistently amazed that these methods, that seem so dodgy, often recover relationships that make ecological sense (as here). So I should probably stop complaining and concede that they might be recovering real patterns (at least now that the phylogenetic hypotheses are more robust and the effects of missing data or rubbish dating priors are better characterized).

So where to next? Should the focus be on improving these methods so it is easier to detect real patterns, should it be on collecting more data for interesting clades to fill in missing data (species, traits) or should it be on collating multiple such datasets to look for concordant patterns across clades? For instance, here, what are the plants doing? To answer that we need to consider the interaction of multiple plant families with a single butterfly genus. How do we do that?

And what are the limits of these macro-methods? This butterfly clade appears to be a recent and rapid radiation. How do we look at character evolution and predictors of rate shifts when species might still be hybridizing? Is there real scope to link population and phylogenomics? How quickly will technology and bioinformatics pipelines progress in order to use complete genomes (and tons of them) to answer such questions? My gut feeling is actually not that fast and that the next ten years or so are going to see a flurry of methods to continue asking these questions with dodgy, patchy data and then, in 10 years, we will have to start over when we are confronted with a whole different kind of dataset requiring a whole different set of techniques.

As ever, exciting times.

Will Pearse

Sorry this post is so late; entirely my fault, not Lynsey’s. This feels like an excellent paper to get back on the horse with, because (as all phylogenetics papers do these days), it makes me feel very old. I feel as if I just popped out for a packet of cigarettes and suddenly the whole world changed – pyRAD? Is that like… PAUP*? What year is it?

And yet, somehow, the problems are the same. We have thousands of loci, but we have to concatenate them. We have a wonderfully resolved tree, but we still have to use the same old ancestral state reconstructions. I’m not criticising this paper, which is excellent, and I’m not even sure I’m criticising the methods, which are the best we can do at present. Yet, somehow, I still feel a bit worried whenever I see any sort of reconstruction – even (especially?) when I’m doing it myself.

Which is why it’s so nice to see methods being applied with care, and in such great diversity, to a system with strong a priori hypotheses. Above Lynsey asks whether we need new models or more data. It’s easy to look at the grey branches (“we don’t know”) on these phylogenies and think that we need more data, but in reality I think some of that would be gap-filling. The data we have has already cost a lot of blood and sweat, and is beyond a phylogeneticist’s wildest dreams a decade ago – is that not enough? What we need are explicit models that tell us what an adaptive radiation looks like. That does mean a lot more sweat, and it could mean even more data collection than these authors have already done, but without it, we’ll have no broad framework within which to place data-rich case studies such as these.

*in this case, the asterisk stands for “absolutely not”.


Botany 2014 – Model checking, data cleaning, and phylogenies galore

Phylogenetics. It’s just one damned argument over which set of information is more important after another. Or something like that. Taken from the play/film by Alan Bennett.

Will Pearse

I was lucky enough to be invited to speak at the colloquium on Phylogenetic Comparative Methods organised by John Schenk at Botany this year. I’m not really a botanist (shhh!), but I’m glad I attended: lots of cool talks, and lots of nice people.

Stephen Smith’s talk really impressed me; he addressed the tendency for phylogenetics to become divorced from standard evolutionary questions by showcasing how next-gen processing can link them back together again. Everyone who’s been through Evo101 remembers that gene duplication and selection pressures can give dodgy phylogenetic inference. He showed that, if we’ve essentially sequenced everything, variation among gene trees lets us measure and quantify such processes, giving more questions phylogenetics can answer. Apparently this is all possible with variants of his graph method, but please don’t ask me to explain how!…

Erika Edwards launched a rather scathing attack on a recent paper examining angiosperm radiation. I’m not entirely sure I agreed with all of her points, but I was saddened that she didn’t have time to get onto her examples of how assuming the same process operates throughout the whole of a phylogeny isn’t reasonable. I think this is a very good point, and I think models that allow for variation across clades are often ignored, but desperately needed.

Outside of the symposium, there were a lot of talks dedicated to the phylogenetics of particular clades. In almost every case I saw, people were looking for lots of loci they could target, and I very rarely heard anyone use the term RADSeq or SNP at any point. People seemed very concerned about gene trees, and I was surprised to hear it discussed in terms that sounded an awful lot like the great morphology vs. genetic data debates I remember being forced to read about. Indeed, at one point someone actually uttered the phrase “those of you over the age of 40 will remember…” and then proceeded to talk about how to know when we have enough information to conduct an analysis. Maybe there is a chance for past battles to help us in the future!

Phylogenetic Diversity Theory Sheds Light on the Structure of Microbial Communities

O’Dwyer, Kembel & Green. PLoS Computational Biology 8(12): e1002832. doi:10.1371/journal.pcbi.1002832. Phylogenetic Diversity Theory Sheds Light on the Structure of Microbial Communities


This is the prettiest way of showing how communities can be assembled from a wider meta-community I’ve ever seen (from O’Dwyer et al.)

Jenna Morgan Lang

It’s become sooo cliché to say this, but I just can’t help myself: It’s a very exciting time to be a microbial ecologist! You lovers of life writ large enough to be viewed with the naked eye have had all the fun so far. Spilling gut contents and watching predators eat prey to fashion food webs, hunkering down to observe the social behaviors among and within species in a tropical rainforest, catching, marking, and releasing things to understand how they move through space and time, counting, collecting, cataloging. Now it’s our turn!

But wait, when it comes to piecing together the interactions of microbial communities, we have no guts to spill, no behaviors to observe, and while I suppose that in theory one could capture, mark, and release a microbe, I would certainly never hope to recapture it. We have been doing a ton of the collecting, counting, and cataloging in recent years, thanks to cheap and easy 16S rDNA sequencing from diverse environmental samples. We have learned that there is a stupid amount of phylogenetic diversity almost everywhere we “look,” and we can infer, from the functional potential encoded in the genes of the few genomes we’ve sequenced so far, that they are interacting with each other and their environments in really interesting ways.

However, this paper argues, we are not yet very good at using our high-throughput sequencing data to answer questions about the fundamental ecological processes that drive microbial community assembly. Now, everything I know about ecology I learned from reading David Quammen’s Song of the Dodo in 1996. I honestly don’t even remember why – maybe it was the simplicity and applicability of the models, maybe it was Quammen’s excellent storytelling, but I fell in love with Island Biogeography. Not “devote the rest of my professional life to it” kind of love, but more like “I once went on the most awesome road trip, and every time I think back to it, I yearn for it again” kind of love. So when roughly 10 years later, the field of microbial community ecology went bananas, and I found myself smack dab in the middle of it, my thoughts immediately and frequently turned to Island Biogeography. How cool would it be to take these models that have been tested and tinkered with for decades and adapt them for microbial communities? Unfortunately, I was not equipped or inclined to actually do this sort of work. But, there are people out there like James O’Dwyer, Steven Kembel, and Jessica Green, who are.

Having said that, I’m hoping that the real ecologists will chime in about the nuts and bolts of the model described in this paper, because I just want to provide a context for it. The authors propose that their framework will allow us to address two issues.

The first, more pragmatic, issue is related to how we census microbial communities. We cannot simply stake off a grid and sit down for a few hours identifying and counting species. We have to scoop up the entire grid, put it in a blender, and extract DNA from it. Typically, even after millions of observations (sequences), you will still encounter new species. Think rainforest canopy fogging, like times 100 million. So, a Species Abundance Distribution (SAD), with # of species on the y-axis and abundance on the x-axis, will have a very long tail. No big deal, except that obtaining millions of sequences for every sample can still be tricky. For example, I recently sequenced 15 samples on an Illumina MiSeq. I obtained ~20 million high-quality 16S rDNA sequences. Ideally, this would be more than 1 million per sample. Unfortunately (and this is common), for reasons we don’t yet understand, the number of sequences per sample ranged from ~98,000 to ~2 million. Most methods (e.g., UniFrac) used to compare phylogenetic diversity (PD) between samples involve subsampling all to the smallest sample size. In this case, I’d be ignoring 18,323,722 sequences! That’s 92.5% of my data. And, forget about it if I want to compare my data to something collected 10 years ago, or to the samples of the future with their bajillion sequences per sample!
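To make the cost of subsampling concrete, here is a minimal sketch of the arithmetic. The per-sample read counts are invented for illustration (only the 15 samples and the rough ~98,000 and ~2 million extremes echo the numbers above), and `rarefaction_loss` is a hypothetical helper, not part of UniFrac or any other package.

```python
def rarefaction_loss(read_counts):
    """Return (kept, discarded, fraction_discarded) when every sample is
    subsampled down to the size of the smallest sample."""
    depth = min(read_counts)               # the rarefaction depth
    kept = depth * len(read_counts)        # reads that survive subsampling
    total = sum(read_counts)
    discarded = total - kept
    return kept, discarded, discarded / total

# Hypothetical read counts for 15 samples (not the real MiSeq run).
counts = [98_000, 2_000_000] + [1_300_000] * 13

kept, discarded, frac = rarefaction_loss(counts)
print(f"kept {kept:,} reads, discarded {discarded:,} ({frac:.1%})")
```

However the reads are spread across samples, a single shallow sample sets the depth for everyone, which is why the fraction thrown away climbs so quickly.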

In walks the central result of this paper: an analytical method to obtain the expected phylogenetic diversity of a local sample from a larger community. This they term the Edge-length Abundance Distribution (EAD), and it is an analogue to the SAD. But, instead of counting species and plotting them against their abundance, we are now plotting the total amount of branch length leading to a given number of tips against that number of tips. Or something like that… Anyway, this EAD displays approximately power law behavior, which apparently means that we can use it to do ecology!
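For anyone who (like me) finds the verbal definition slippery, here is a rough sketch of how one might tabulate an EAD for a toy tree. It counts descendant tips rather than individuals, the tree and its branch lengths are made up, and `edge_length_abundance` is a name invented for this example, so treat it as an illustration of the idea rather than the authors' implementation.

```python
from collections import defaultdict

# Toy rooted tree: each node is (branch_length, [children]); a tip has no children.
# Topology and branch lengths are invented for illustration.
TREE = (0.0, [
    (1.0, [(1.0, []), (1.0, [])]),                      # clade of two tips
    (0.5, [(1.5, []), (0.5, [(1.0, []), (1.0, [])])]),  # clade of three tips
])

def edge_length_abundance(node, ead=None):
    """Fill `ead` so that ead[k] is the total length of all branches with
    exactly k descendant tips; return (tips_below_this_node, ead)."""
    if ead is None:
        ead = defaultdict(float)
    length, children = node
    if children:
        n_tips = sum(edge_length_abundance(child, ead)[0] for child in children)
    else:
        n_tips = 1
    ead[n_tips] += length
    return n_tips, ead

n_tips, ead = edge_length_abundance(TREE)
for k in sorted(ead):
    print(f"branches with {k} descendant tips: total length {ead[k]}")
```

Summing the values of `ead` gives back the total branch length of the tree, which is a handy sanity check.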

One thing we can do with it is use it to normalize the UniFrac distance between differently sized samples, so that’s nice because it makes the first issue go away. The other thing that it can be used for is to start testing hypotheses about the ecological processes that contribute to microbial community structure, and they provide some proof-of-principle examples of its use with human microbiome data. For example, they asked whether the microbiome of someone’s hand has more or less PD than expected if the microbiome were derived from a random sampling of all microbiomes. If the PD is lower than expected, we might hypothesize that some environmental filtering is taking place (ahem, hand sanitizer). I don’t know that any particularly mind-blowing ecological questions were answered in their proof-of-principle application, but now that we microbial ecologists have this phylogenetic framework, we can extend it, and most importantly, start designing experiments with these interesting ecological questions in mind.

Will Pearse

I really, really enjoyed this paper. Since I’m a methods nerd, I’m going to talk first about the method, and then about why the biology is exciting here.

I’ve wasted hours of my life shuffling species around to make null distributions, so a method like this that allows us to exactly and quickly compute a null expectation is amazing! The derivation is extremely neat, but I found it initially confusing because I was stuck thinking about PD (phylogenetic distance). They don’t use the distance between species, rather the ‘opposite’ of phylogenetic distance between species: the distance between the crown of a clade and the root of a phylogeny. I’m very much at the limit of my maths here, but this does make me wonder one thing. If all of these expectations are based on branch lengths for complete clades in the metacommunity phylogeny (i.e., it counts the tips descending from a node), how appropriate is that for situations where a community doesn’t contain all members of a clade? In such cases, is the expected variance in PD meaningful, or would it under-estimate what we see in practice, because not all species within a clade are going to be present in a particular community? I find it hard to imagine that I’ve hit upon a central problem with the paper, so I’d be grateful if someone could comment and clear up my confusion!
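To show why an exact expectation beats shuffling, here is a small sketch under simplifying assumptions: tips are drawn without replacement and abundances are ignored, so this is the standard hypergeometric argument rather than the paper's own derivation. The tree (listed edge by edge) and all function names are invented for the example. A branch contributes to rooted PD whenever at least one of its descendant tips is sampled, so its inclusion probability is one minus the chance that all k sampled tips miss it.

```python
import random
from math import comb

# Hypothetical metacommunity tree, one entry per edge:
# (branch_length, set of tips descending from that edge).
EDGES = [
    (1.0, {"a"}), (1.0, {"b"}), (1.0, {"a", "b"}),
    (1.5, {"c"}), (1.0, {"d"}), (1.0, {"e"}),
    (0.5, {"d", "e"}), (0.5, {"c", "d", "e"}),
]
TIPS = sorted(set().union(*(tips for _, tips in EDGES)))

def pd(sample):
    """Rooted PD: total length of edges with at least one sampled descendant."""
    return sum(length for length, tips in EDGES if tips & sample)

def expected_pd(k):
    """Exact expected PD of k tips drawn without replacement from TIPS."""
    n = len(TIPS)
    return sum(length * (1 - comb(n - len(tips), k) / comb(n, k))
               for length, tips in EDGES)

def shuffled_pd(k, reps=20_000, seed=1):
    """The slow way: average PD over many random draws of k tips."""
    rng = random.Random(seed)
    return sum(pd(set(rng.sample(TIPS, k))) for _ in range(reps)) / reps

for k in range(1, len(TIPS) + 1):
    print(k, round(expected_pd(k), 3), round(shuffled_pd(k), 3))
```

The two columns should agree up to Monte Carlo error, but only one of them needs thousands of reshuffles to get there.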

Moving on to the biology. Community phylogeneticists spend a lot of time looking at the importance of how a source pool is defined spatially (look at this lovely paper someone wrote), and that we can find similar patterns in the human microbiome is wonderful. The classical explanation for clustering at wider scales (communities vs. other humans and other habitats) is that there’s habitat filtering, and that overdispersion should be found in tighter definitions of source pool (community vs. other humans in the same habitat) suggests competition within habitat type. I think it would be cool to have some more functional data on what these microbes are doing; this overdispersion might actually reflect facilitation, whereby different microbes are performing different ecosystem services and together they’re making a more stable community. That might sound a bit group-selection-y, but I’m certain it would be an interesting avenue to explore.

Lynsey McInnes

My heart kinda sank when I saw the paper for this week. One, I knew Will would have smarter things to say than me and two, I just don’t enjoy community phylogenetics papers…I ummed and ahhed over what to write for my post, to sit on the fence and mumble on about interesting facets of the paper or to jump right in and have a poorly thought out and largely uninformed rant about community phylogenetics. I’m going to have a go at the second, but try to not fall flat on my face. Ho ho ho.

My biggest gripe with CP is what is a community? How can it be circumscribed? And the same goes for the metacommunity/regional pool. Of course, you can find out cool and interesting things about delimited communities (whole humans vs. noses, continents vs. ecoregions, whatever), but it all just feels so forced. The authors here appear well aware of this and devote admirably long tracts of the paper to highlighting that methods such as theirs are still quite dissociated from actual ecological mechanisms and processes that drive species dynamics.

While the authors’ advances here, doing away with tedious null distributions and endless simulations, are definitely great, I just feel like there is still a gap to cross before these kinds of metrics really help us…either to understand some fundamentals of assembly processes or have more practical ends like guiding conservation decisions or informing public health policy. Yes, I’m being vague and I’m not even sure what I ultimately want or feel is possible to get out of similar analyses, but as it is, methods are getting more and more swanky but with no real advance. Yes, metacommunity size matters, no shit! It’s always about scale, scale, scale, scale. I’m guessing because communities and metacommunities are bordering on arbitrary concepts, any metric will always depend on scale? No?

On less vague and ranty notes….some other thoughts that struck me. How did the authors generate phylogenies of the microbiome? What breadth of microbial diversity is found in humans and how well characterised is it? How robust are these methods to these kinds of mega phylogenies? This kind of thing probably interests me more than applying some crazy metrics?

Another thing that I wonder about CP (and I’m fairly sure that some work on this kind of thing already exists) is: what happens when you think about CP across trophic levels? Does bringing in trophic interactions help explain, or bring consistency to, the patterns observed? Because, of course, species don’t just interact with other co-occurring species in their clade.

I think I will stop here and leave the rant at that. Apologies to the authors, this was a well-written, balanced and indeed innovative paper…whose subject I just don’t happen to like. Probably my loss more than anybody else’s…

Molecular evolutionary signatures reveal the role of host ecological dynamics in viral disease emergence and spread

Duke-Sylvester et al., 2013. Philosophical Transactions of the Royal Society B 368(1614). DOI:10.1098/rstb.2012.0194. Molecular evolutionary signatures reveal the role of host ecological dynamics in viral disease emergence and spread

Below, we give our first impressions of this article. Please comment below, or tweet Will or Lynsey (maybe use #pegejc). Think of this as a journal club discussion group!

Will Pearse

We covered this paper in a (real, live, in-person) journal club here at the University of Minnesota, so the views below are probably not just mine. So, if you like what’s written below, it’s from the community genetics reading group; if you don’t like what’s written below, it’s all from me.

This study links an epidemiological model of how rabies spreads among raccoons to the structure of the genealogy that the rabies would have given present-day sequence data. I really like the modeling framework this paper presents. Inferring ecological patterns directly from a genealogy is a brave thing to try and do, and while others have done similar work this is probably the most explicit model I’ve seen someone trying to fit. This is also the paper’s greatest weakness: I’m not sure that these models could ever be fit successfully to real data. Taking figure 3 as an example, the authors state that they can detect the influence of long-distance dispersal because the exponential growth model fits their data better; I don’t think we would ever get such neat graphs with real data, and the predictions of their linear and exponential models look too similar (to me) to distinguish between in the presence of experimental noise. Indeed, while the authors use parameters derived from real data, they don’t actually attempt to fit their models to real genetic data; I wonder if they would be able to do so.
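As a rough way of seeing how noise can blur the line between linear and exponential growth, here is a generic simulation sketch. It is not the authors' model: the growth rate, the ten time points, the multiplicative noise and the comparison by residual sum of squares are all assumptions chosen for illustration, and `exponential_wins` is a made-up helper.

```python
import math
import random

def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def rss(y, fitted):
    """Residual sum of squares on the original scale."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def exponential_wins(noise_sd, rate=0.1, reps=1000, seed=1):
    """Simulate truly exponential data with multiplicative noise and return the
    fraction of data sets where the exponential fit has the smaller RSS."""
    rng = random.Random(seed)
    t = list(range(10))
    wins = 0
    for _ in range(reps):
        y = [math.exp(rate * ti + rng.gauss(0, noise_sd)) for ti in t]
        a, b = ols(t, y)                               # linear model
        linear = [a + b * ti for ti in t]
        la, lb = ols(t, [math.log(yi) for yi in y])    # log-linear fit
        expo = [math.exp(la + lb * ti) for ti in t]
        wins += rss(y, expo) < rss(y, linear)
    return wins / reps

for sd in (0.0, 0.05, 0.2, 0.5):
    print(f"noise sd {sd}: exponential preferred in {exponential_wins(sd):.0%} of runs")
```

Both models have two parameters, so comparing raw RSS is a fair if crude yardstick; a real analysis would want a likelihood-based comparison fitted to actual sequence data.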

Moving past those rather snarky comments, this paper interested me because they’re attempting to model the ecological processes that might produce a particular genealogical (phylogenetic) structure. By looking for what kinds of signals long-distance dispersal leaves in the genome of rabies, they’re able to make useful predictions about what the rabies is doing right now – that’s presumably a lot of help if you’re trying to control an epidemic. I’d never really thought about how perfect a system diseases are for eco-phylogeneticists – they jump from host-to-host, making lineages nice and separate, and they evolve really quickly. Let’s just ignore multiple infections and DNA saturation for a moment, and think about the opportunities for fitting these kinds of complex models. Maybe we can all start linking phylogenetic (whoops – genealogical) structure to explicit models of evolution that incorporate ecology, and in the process help us better understand disease dynamics. As an eco-phylogeneticist, that kind of excites me!

Lynsey McInnes

First, apologies for the delay to this week’s post – I got caught up in Easter Monday laziness and what follows is largely random thoughts that popped into my mind as I read this paper on the train into work this morning.

I really enjoyed the idea behind this paper. I haven’t read much of the literature around the eco-evolutionary dynamics of virus evolution, but it sounds like crazy fun. I have read A LOT of the literature around models of spatially-explicit diversification and this paper definitely made me want to see more cross-talk between these two research areas (neatly incorporating my new field of statistical phylogeography/population genetics).

(I think) just like Will, I was excited by the possibilities that the authors outline, impressed by their modelling framework, but dubious about some of their outcomes and the likelihood that such a detailed model could often be used for predictive inference. I’d be happy to be proven wrong however, and have very little feeling for how much data is really needed/exists for such models to be powerful for, e.g. public health decision making. I’m also not convinced by figure 2 – is there not a ton of pseudoreplication going on in there – should there not be only five data points (as in figure 1b)? Dare I say it – how about a mixed effects model?

Although the authors did perform sensitivity analyses and spend time discussing the effects of landscape heterogeneity and demographic stochasticity on their ability to infer process, I would have liked to have seen more exploration of the effect of missing or biased data (for example how noisy can the data be before signal becomes distorted/lost?). I concede I have not checked the supplementary information and this information might be in there…

As a side project, I’ve been thinking about the effects of dispersal on macroevolutionary diversification and it was refreshing in this paper to see local and long distance dispersal so simply made distinct. I think this clarity of distinction is lacking from macroevolutionary analyses (so that when people look for the effects of dispersal on diversification they get conflicting results depending on whether they are looking inside a restricted area or beyond it (to cut a long story short)). Here, the authors have clear hypotheses on the differences expected whether or not the host moves beyond its immediate neighbourhood. One imagines that there aren’t really two distinct categories, but there is certainly more than one. So, hooray.

This comment might have come across as overly negative. I did not mean it to. I really enjoyed reading this paper, it was extremely well-written and thought-provoking (such that someone with no real background in disease dynamics could understand both the rationale and methods). I am going to check out the other articles in the special issue of Phil Trans that this article came from and look forward both to seeing how these types of models develop and hopefully to pilfering some of these ideas across into macroevolutionary diversification (that is similarly affected by processes acting on ecological time-scales). Always good to end on a blatant note of self-citation.

Treating fossils as terminal taxa in divergence time estimation reveals ancient vicariance patterns in the palpimanoid spiders

Wood et al. Systematic Biology 62(2): 264-284. DOI:10.1093/sysbio/sys092. Treating fossils as terminal taxa in divergence time estimation reveals ancient vicariance patterns in the palpimanoid spiders.

This is a guest post with April Wright. Below, we give our first impressions of this article. Please comment below, or tweet April, Will or Lynsey (maybe use #pegejc). Think of this as a journal club discussion group!


April Wright

When I started graduate school, I envisioned doing some fusion of paleontology and molecular phylogenetics. What I didn’t envision is other researchers constantly asking me “Why?”. Why use morphology when you can have a bajillion base pairs of sequence data? Why use morphology when we have such nice, explicit models for sequence evolution?

But in the past year and a half, there’s been a series of really lovely papers forging a kind of truce between the morphological and molecular worlds, and I think this paper highlights why this is important: Fossil taxa are the only record we have of extinct organisms, and we can learn a lot about the world of yore from them. Seems like an obvious point, but from the sheer volume of people asking me “Why?!”, it apparently is not.

For a little bit of background, in 2011, Alexander Pyron authored a paper treating fossil taxa as tips, rather than calibration points on nodes, in a chronogram. And people, in both the molecular and morphological spheres, were pretty excited about this. It’s an intuitively appealing idea. Often, fossils are placed on a tree as calibration points, but we don’t really know the fossil belongs where we’ve placed it. Treating the fossil as a tip in the tree allows the fossil to be placed with confidence. It’s a nice concept.

In discussions with coworkers, people at meetings and randos off the street, it became clear to me that not everyone was sold on the utility of the idea. While many people liked the idea of treating fossils as tips conceptually, there are still questions about whether this practice will actually result in any noticeable effect on tree estimation or the inferences drawn from those trees. The paper for this week is quite nice in that it makes use of real data from fossil and extant spiders that the authors want to use to make an inference about historical biogeography to test the effect of treating fossils as tips.

A challenge in integrating morphological and molecular data is the degree of asymmetry between data types. In combined analyses, often there are many species for whom molecular data is available, some species for whom morphological data is available, and only a handful of species with both. The net result of this is basically a molecular tree and a morphological tree held together by a couple of taxa. This isn’t the case with this paper, and I was impressed by the care taken with the sampling of morphological characters in extant and fossil spiders, though the fossils are not well-intercalated with the extant taxa (more on this in a moment).

One of the interesting results in this paper is that treating fossils as tips on the tree resulted in node ages that were older than when fossils were used as calibrations. This hasn’t been found in other studies (but, this is one of the first studies of its kind, so this pattern may be quite common, and we don’t know it yet). As I mentioned before, the fossils are not intercalated in with the extant taxa, instead branching from a single point. This odd result highlights that we need to do more research to understand the effects of using fossils as tips.

This study takes it one step further and uses the dated phylogeny they obtained to make a biogeographical inference about spiders. Using the software LaGrange, the authors looked at the historical ranges of the spider clades for which they had data. The authors support the conclusion that the breakup of Pangaea into Laurasia and Gondwana led to a vicariance event within spiders. This is a very cool result, though likely not very different from the one they would have received treating fossils as calibrations only.

I’m going on a bit long, so I’ll wrap up by saying that I really enjoyed this paper. I think this is a great example of going about a new type of analysis in a very thoughtful way. I think the primary result of a vicariance event in spiders at the Pangaea split is a pretty neat, punchy result, but there’s plenty in this paper for any methods dork to have fun with.

Will Pearse

In theory, I’m a phylogeneticist, so I should probably have an opinion about how we best use fossil calibration points. As such, I’m not going to talk about spiders, or biogeography, and am essentially just going to talk about what I remember of Joe Felsenstein’s talk at Evolution 2012 on this issue. I got very excited coming out of that talk, so I hope I’ve remembered the details correctly!

There is a very big problem in phylogenetics that I don’t think enough people talk about: how do we date a phylogeny? We can now build massive phylogenies with RAxML, but the output doesn’t tell us when things evolved, just what’s closely related to what. Programs like BEAST let us simultaneously estimate phylogenetic structure and timing of evolutionary divergence, but we need to calibrate our results with fossil data. Otherwise we’re just inferring dates based on molecular data, and ignoring when we know certain groups must have evolved given what we find in the fossil record.

Wood et al. argue that dating clades by using fossils to set prior distributions on how old they’re likely to be may not be the best approach. I agree with them. Instead of using fossils to date clades, they’re putting the fossils in as extinct taxa, and then building phylogenies around them. This is kind of neat; it means the species fossils represent become part of the tree, and the extant species get dated in the process of making a phylogeny with those dated fossils in it. They argue that, when they use this method, their results are less driven by their prior distributions, and as a rather naïve Bayesian statistician (that’s a pun, stats fans), I agree with that. I want the signal in my data to drive my answer, not the constraints and assumptions I made at the beginning.

Felsenstein outlined what I view as almost an extension of this method. In it, you use morphometric measurements of the actual fossils, along with measurements of extant species, to figure out where fossils go within a clade. Essentially, this means you can figure out what a fossil’s closest relatives within a clade were, what branch of a phylogeny they’re more like, and get the dating of the phylogeny for free because we know roughly when the fossil was put down. I view this as an improvement on the present method, because the fossil taxa are not left orphaned in their own sister group to the extant species (see figure 1 in the paper) – they’re nestled in there with them, which of course reflects how the clade actually evolved. The disadvantage is that (as far as I’m aware) it’s not implemented yet, and it is probably vastly more data-hungry.

Lynsey McInnes

First, thanks April for providing our first guest post and for picking a whopper of a paper!

Man, this paper was dense and I commend the authors whole-heartedly for steering a relatively clear path through the huge number of analyses performed and for extracting the relevant conclusions thoughtfully. These guys certainly know their methods, and their spiders!

I was convinced by their argument to include fossils as terminal taxa and liked their inclusion of uncertainty around the fossil ages. It would have been a shame to take one step forward (including fossils as terminal taxa) and one step back (pretending their age is known with certainty). Their conclusion – use all the data – was hardly surprising but very nicely demonstrated.

I do wonder however whether the fossils would be as crucial if extinct and extant taxa had overlapping/the same ranges? While this paper is a cool example of distributions shaped by the break up of Pangaea, making the fossil information central to any valid conclusions, how important would fossil info be if it didn’t provide additional/new information about ancestral distributions? Presumably, in these cases, the terminal fossils would function more like extant taxa in the same area, lending more weight to any conclusions on distributions possible using extant taxa alone. I guess that would be a bit boring, and so Wood et al’s situation makes for a much more gripping tale.

Although any of the following would have made the paper excessively long and I imagine may have been covered in the Pyron or Ronquist papers the authors refer to, I wonder, totally naively, if one could partition out how strongly the fossils drive the results found, how much fossil data is needed to change the story, what happens when one fossil is misplaced (in time or space), what happens if preservation biases mean that more fossils are found in one region than another so the signal is somehow skewed…hm…I am sure there already exists a whole literature on dealing with these issues when using fossils as calibration points and this could be preyed upon to find out what would happen in this ramped up fossil use case.

That was a harsh paragraph to end on when discussing an admirably thorough, thoughtful AND neat paper and came out mostly because I’m at a bit of a loss as to how to critique such a piece of work. So I’ll just stop rambling right here!
