Simon Maskell 'Steps Towards a New Approach for Data Science', Friday May 21, 2pm GDT, 9am EDT on

Simon Maskell is giving a Stan overview talk at UNINOVE in São Paulo, Brazil.

Sao Paolo presentation invite.pdf (358.2 KB)

New Approach for Data Science

Abstract: The use of data science is
becoming increasingly prevalent in the context
of scientific progress. This appears to stem
from a combination of: accessibility of data;
availability of computing resources; useability
of tools; ambition.

Some of the progress relates to the use of
popular tools, specifically those employing
Deep Learning, that have the key advantage
that they can be configured relatively
straightforwardly to exploit huge computational
resources and can be readily applied to large
datasets. The result in commercial settings is
that the number of transistors being used for
Deep Learning is doubling every four months;
commercial ambition is dramatically outpacing
Moore’s law.

These techniques have the key disadvantage
that they do not fully capitalise on pre-existing
scientific understanding and do not work well in
the context of small data which is expensive to
collect (in terms of finances, time or ethical
considerations). This situation has led to the
development and adoption of probabilistic
programming languages (PPLs), tools that
enable users to describe scientific hypotheses
and the relationship of these hypotheses to
data. The tools use Bayesian statistics to make
inferences from the data and the use of PPLs is
currently growing exponentially. One popular
PPL is Stan.

In this talk, Simon will describe this context and
then explain how the team’s recent research
(Big Hypotheses) relates to the development of
a family of algorithms and techniques (SMC
Stan, Streaming Stan) that can exploit
large-scale parallel computational resources.
The overarching aim of this research is to make
it possible for us to generate accurate results in
the context of problems that have only recently
been considered impossible to solve. To bring
this idea to life (and with reference to
CoDatMo), the problem of analysing data
pertinent to the prevalence of COVID (e.g. in
Brazil) will be discussed, with a focus on what
has been done, what is happening now and
what might be possible in the future.

We will have time for questions
and answers and networking at
the end.


Great! This is a nice opportunity for us to foster talent and future contributions in Brazil (specifically São Paulo). All are welcome to attend. The Google Meet available through my Enterprise faculty GSuite account can accommodate up to 250 participants; I hope that will be enough.

Thanks @breckbaldwin!


Per the linked PDF, I’ve edited the title from “Friday March 19” to “Friday May 21”. Hope this is the correct thing to have done.


@jscolar thanks, not sure how that happened.