* * *
Across a range of creative domains, individual careers are characterized by hot streaks, which are bursts of high-impact works clustered together in close succession. Yet it remains unclear if there are any regularities underlying the beginning of hot streaks. Here, we analyze career histories of artists, film directors, and scientists, and develop deep learning and network science methods to build high-dimensional representations of their creative outputs.
We find that across all three domains, individuals tend to explore diverse styles or topics before their hot streak, but become notably more focused after the hot streak begins. Crucially, hot streaks appear to be associated with neither exploration nor exploitation behavior in isolation, but a particular sequence of exploration followed by exploitation, where the transition from exploration to exploitation closely traces the onset of a hot streak. Overall, these results may have implications for identifying and nurturing talents across a wide range of creative domains.
* * *
A remarkable feature of creative careers is the existence of hot streaks. Despite the ubiquitous nature of hot streaks across artistic, cultural, and scientific domains, it remains unclear if there are any regularities underlying the beginning of a hot streak. Understanding the origin of hot streaks is not only crucial for our quantitative understanding of patterns governing creative life cycles but it also has implications for the identification and development of talent across a wide range of settings. Deciphering what predicts hot streaks, however, remains a challenge, partly due to the complex nature of creative careers. The lack of systematic explanations for hot streaks, combined with the randomness of when they occur within a career, paints an unpredictable, if incomplete, view of creativity across a diverse range of domains.
Of the myriad forces that might affect career progression and success, the strategies of exploration and exploitation have attracted enduring interest from a broad set of disciplines, prompting us to examine their potential relationship with hot streaks. Indeed, according to the literature, exploitation allows individuals to build knowledge in a particular area and to refine their capabilities in that area over time. This could be relevant for understanding hot streaks since exploitation allows individuals to “go deep” in a focal area to both establish expertise in that area and foster a reputation related to that expertise. Exploration, on the other hand, engages individuals in experimentation and search beyond their existing or prior areas of competency. Although exploration is riskier and consequently associated with larger variance in outcomes, it may also increase one’s likelihood of stumbling upon a groundbreaking idea through unanticipated combinations of disparate sources. In contrast, exploitation, as a conservative strategy, may stifle originality and may, over time, limit an individual’s ability to consistently produce high-impact work. Taken together, the benefits and downsides of these contrasting approaches raise a fundamental question: Are career hot streaks reflective of exploration or exploitation behavior, or some combination of the two?
To answer this question, we develop computational methods using deep learning and network science and apply them to large-scale datasets tracing the career outputs of artists, film directors, and scientists. Specifically, we build high-dimensional representations of the artworks, films, and scientific publications they produce (Supplementary Note 1), which capture abstract concepts, styles, and topics represented therein, allowing us to trace an individual’s career trajectory in the underlying creative space. We further quantify the hot streak within each career by the impact of the works produced, measured by auction price, IMDB ratings, and paper citations within 10 years, respectively. We then correlate the timing of hot streaks with the creative trajectories for each individual, allowing us to examine changes in the characteristics of the work one produces around the beginning of a hot streak.
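As a rough illustration of what comparing creative trajectories around the onset of a hot streak might look like, the sketch below measures the diversity of style or topic labels in a window of works before and after the onset. The use of Shannon entropy as the diversity measure, the window size, and the names `style_entropy` and `entropy_around_onset` are assumptions made for this example; the text above does not specify the measure used in the study.

```python
import math
from collections import Counter


def style_entropy(labels):
    """Shannon entropy (in bits) of a sequence of style/topic labels."""
    if not labels:
        return 0.0
    counts = Counter(labels)
    total = len(labels)
    # sum of p * log2(1/p) over the observed labels
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_around_onset(labels, onset, window):
    """Entropy of the `window` works just before and just after index `onset`."""
    before = labels[max(0, onset - window):onset]
    after = labels[onset:onset + window]
    return style_entropy(before), style_entropy(after)


# Hypothetical career: style labels in chronological order,
# with the hot streak assumed to begin at the fifth work (index 4).
career = ["A", "B", "C", "D", "B", "B", "B", "B"]
before, after = entropy_around_onset(career, onset=4, window=4)
```

In this toy career the four works before the onset each have a different style (high entropy), while the four works after it share one style (zero entropy), which is the exploration-then-exploitation pattern described in the text.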
* * *
To examine the art styles of each artist and their exploration and exploitation dynamics, we collected over 800 K images of visual arts from museum and gallery collections, covering the career histories of 2128 artists. Building on recent advances in computer vision, we use a transfer-learning approach to construct an embedding for artworks using deep neural networks. We generate a 200-dimensional embedding of each artwork (see “Methods” and Supplementary Note 1.1), and identify art styles through clusters on the 200-dimensional embedding space, allowing us to trace the evolution of art styles over the course of their careers.
The architecture of the deep neural network used to build a high-dimensional representation of artworks. We connect a pre-trained VGGNet with three fully connected layers and fine-tune the model with art style labels. The blue box indicates the convolutional layer and the yellow box the max pooling layer. The green bar shows the top styles predicted by the model for the input image (Image reproduced under Creative Commons Attribution 3.0 Unported license).
We construct the high-dimensional representation of artworks by combining the output from the first and third convolutional layers (blue arrows) and the second fully connected layer (red arrow). An illustration of the 64 filters in the first convolutional layer. We highlight the first filter, the original image, and the output after the image passes through the filter. The red box represents the size of the filter (3 × 3 pixel box).
The activation of four layers in VGGNet and the saliency map of the post-impressionism class. The saliency map visualizes the pixels that are important for predicting the post-impressionism class. Layers close to the input capture low-level features, such as brush strokes, whereas the layers close to the output capture high-level features, such as the shape of objects. (d) Word embedding for film plots. Target words are encoded as a binary vector and passed to the neural network. We use the hidden layer to represent the embedding of words and plots.
We apply DeepWalk to the co-casting network of 79 K films to capture the co-occurrence of nodes from the trajectories of random walkers. We use the hidden layer of the model to represent the cast information. We concatenate the word embedding from plots and the node embedding from casts to construct a 200-dimensional vector to represent each film. (f) An illustration of the co-citing network among papers published by a scientist. Two papers are connected if they have at least one common reference, with link weight measuring the total number of references they share. Following prior work, we apply a community detection algorithm to the co-citing network and identify the topic of each paper as the community it belongs to.
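The co-citing network described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `co_citing_network`, the dictionary-of-sets input format, and the paper and reference identifiers are all hypothetical. The community detection step is omitted because the text does not name the algorithm used.

```python
from itertools import combinations


def co_citing_network(references):
    """Map each linked pair of papers to the number of references they share.

    `references` maps a paper id to the set of works that paper cites.
    """
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared >= 1:  # link only papers with at least one common reference
            edges[(a, b)] = shared  # link weight = number of shared references
    return edges


# Hypothetical reference lists for four papers by one scientist.
papers = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r2", "r3", "r4"},
    "p3": {"r4"},
    "p4": {"r9"},
}
edges = co_citing_network(papers)
```

Here `p1` and `p2` share two references and `p2` and `p3` share one, so only those two pairs are linked, while `p4` remains isolated. The resulting weighted edge list could then be passed to a community detection routine of one's choice to assign each paper a topic.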
* * *
It is important to note that while our results demonstrate significant and consistent relationships across domains, the overall effect size seems modest. On the one hand, this suggests that additional controls might further tighten the relationship. For example, after we control for authorship and the effect of collaborations, the effect size appears to increase (Supplementary Note). On the other hand, it also suggests opportunities to examine other potential processes that may also underlie the onset of hot streaks. Indeed, real careers are complex, with heterogeneous influences operating across domains as well as a multitude of individual and institutional factors. Hence, it is plausible that additional factors may also be at work. In this study, we also tested several alternative explanations for the onset of hot streaks (Supplementary Note). Although each of the hypotheses we tested appears plausible by itself, we find that none of them shows consistent associations, indicating that none of these alternative hypotheses alone can account for the hot-streak dynamics we studied.
It is also likely that on an individual basis, the exploration–exploitation transition is further influenced by other external factors, such as shifting market conditions, social network structure, and disciplinary culture. Individuals may also receive short-term feedback (e.g., art critiques or peer reviews) that may offer additional signals shaping their career focus. As such, the patterns of exploration and exploitation may reflect personal initiatives as well as responses to external forces. Nevertheless, our results suggest that, despite the obvious heterogeneity in the settings we examined and the myriad factors that may affect career progression and success, the exploration–exploitation dynamics appears consistently associated with the onset of hot streaks across rather diverse domains.
The data-driven nature of our study indicates that it is not immune to two limitations common in this type of analysis. First, while the datasets we assembled in this paper represent large collections of career histories and outputs across a variety of domains, they are limited to individuals who have had sufficiently long careers providing enough data points for statistical analyses (Supplementary Note 1). Second, this paper presents correlational evidence, whose primary goal is to investigate empirical regularities associated with the onset of hot streaks. Future work using causal research designs may improve causative interpretations of the regularities reported here.
Furthermore, while this work mainly focuses on universal patterns related to the onset of hot streaks, there could be important domain-specific differences in the role of exploration, exploitation, and success that are worth investigating further. For example, our preliminary analysis suggests that the level of exploration and exploitation in science appears much stronger than in art or film directing. The number of styles/topics within each career also varies substantially across domains. While these cross-domain differences could flow from inherent differences in data and methods, assessing domain-specific patterns is an important direction for future work.
Notably, the sequence of exploration followed by exploitation closely resembles strategies observed in a wide range of natural and socio-technical settings, from animal foraging to human cognitive search, from multi-armed bandits and reinforcement learning to role oscillation between brokerage and closure in social networks to changing innovation strategies over business cycles. This suggests that the sequential strategy of exploration followed by exploitation uncovered in this study may have broad relevance that goes beyond individuals’ careers. Lastly, the representation techniques used in this paper could open up promising avenues for research on creativity, offering a quantitative framework to probe the characteristics of the creative products themselves. Future advances in deep learning may enable researchers to incorporate more creative dimensions, and hence more fruitfully contribute to a computationally enhanced understanding of creativity.
* * *