Multi-Query Optimization in Spark

This research extends the paper by Michiardi, P., Carra, D., & Migliorini, S. (2019a), “In-memory Caching for Multi-query Optimization of Data-intensive Scalable Computing Workloads”, to support join operators.
The focus of this write-up is Chapter 3 (Methodology):

  • An in-depth introduction to the chapter with references
  • An in-depth description of the methodology
  • Search space initialization: recognizing opportunities for shared computation by identifying common sub-expressions across queries, which sets up the search space
  • Optimization stage: modifying the optimizer's search strategy to account explicitly for shared computation and improve overall performance
  • An in-depth description of the proposed algorithms with references
  • Conclusion of the chapter
  • A better organization of the chapter and sections

The write-up must be precise and accurate, so it requires a thorough understanding of the concepts involved.
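
To make the search space initialization step concrete, the following is a minimal sketch of how common sub-expressions might be detected across a batch of queries. It is not the algorithm from the cited paper and does not use Spark's API. The tuple-based plan representation and the names `fingerprint`, `subtrees`, and `common_subexpressions` are assumptions made for illustration. Because the extension targets join operators, the sketch treats the inputs of a join as unordered, which is valid only for commutative (inner) joins.

```python
from collections import defaultdict

# Hypothetical toy plan: (operator, argument, *children).
# A leaf is ("scan", table_name). Illustrative only, not Spark's plan classes.

def fingerprint(plan):
    """Return a canonical string identifying a plan subtree."""
    op, arg, *children = plan
    child_fps = [fingerprint(c) for c in children]
    if op == "join":
        # Assumption: inner joins only, so input order does not matter.
        child_fps.sort()
    return f"{op}[{arg}]({','.join(child_fps)})"

def subtrees(plan):
    """Yield the plan itself and every subtree below it."""
    yield plan
    for child in plan[2:]:
        yield from subtrees(child)

def common_subexpressions(queries):
    """Map each fingerprint found in two or more queries to those query ids."""
    seen = defaultdict(set)
    for qid, plan in queries.items():
        for sub in subtrees(plan):
            seen[fingerprint(sub)].add(qid)
    return {fp: sorted(qids) for fp, qids in seen.items() if len(qids) > 1}

# Two queries that join the same tables in opposite input order.
orders = ("scan", "orders")
customers = ("scan", "customers")
q1 = ("filter", "amount > 100", ("join", "cust_id", orders, customers))
q2 = ("project", "name", ("join", "cust_id", customers, orders))

shared = common_subexpressions({"q1": q1, "q2": q2})
for fp, qids in sorted(shared.items()):
    print(fp, qids)
```

In this sketch the two joins are recognized as the same sub-expression even though their inputs are listed in a different order. The optimization stage would then decide which of the shared candidates are worth materializing in memory.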

Sample Solution