This thesis investigates the economic effect of patents and the patent system through the lens of patent commercialisation. The thesis is composed of four chapters, where each chapter is an independent scientific paper. In the first chapter, we present a novel dataset called IPRoduct which contains patent-product linkage data. This dataset leverages the recently introduced method of patent marking, called Virtual Patent Marking (VPM), to match patents with their corresponding products. We provide an exploratory analysis of the dataset. We also investigate how the sample of patents in our dataset compares to the general population of patents at the USPTO. We find that virtually marked patents are generally more important than the average patent. This rich and expansive dataset, accessible for download at iproduct.io, is designed to offer an invaluable resource for researchers, innovators, and policy-makers alike. The studies presented in the second and third chapters of the thesis rely heavily on the IPRoduct dataset. In the second chapter, we investigate the effect of patent office delays on product market introduction using unique data on patent-product linkages. Firms must balance launching a product before receiving patent allowance, which risks litigation, against waiting for a patent grant, which can impose financial burdens due to delays and missed opportunities. This trade-off is a significant challenge in product commercialization strategy and is currently not well understood. Our findings suggest that while most products are commercialized post-grant, firms are faster to commercialize post-grant the longer they have waited for the grant decision. We also find that patent grants increase the hazard of commercialization. In the third chapter, we adopt a data-driven approach to predict the likelihood of patent commercialization.
The huge disparity in patent commercialization outcomes, with only a small proportion of patents protecting a commercialized product, makes the task of predicting patent commercialization likelihood an interesting topic. By leveraging patent-product data from IPRoduct, we assemble a rich dataset of both commercialized and non-commercialized patents, which forms the foundation for our machine learning classifier models. With these models, we manage to predict the commercialization outcome of patents with accuracy rates of up to 87 percent and F1 scores peaking at 89 percent. We also find that a substantial portion of the prediction power comes from ex-ante indicators available at the time of grant. In the fourth and final chapter, we present a methods paper. Patent documents are a central resource for scientists, yet they are complex and poorly understood. The purpose of this paper is to simplify the use of patent data for non-experts by introducing and elucidating crucial patent metrics. We also provide openly available algorithms to calculate these metrics, which vary in complexity.
Jérôme Baudry, Nicolas Christophe Chachereau, Bhargav Srinivasa Desikan, Prakhar Gupta
Gaétan Jean A de Rassenfosse, Gabriele Pellegrino