• Type: New Feature
    • Status: Accepted
    • Resolution: Unresolved
    • Affects Version/s: BIGDATA_RELEASE_1_2_0
    • Fix Version/s: None
    • Component/s: Query Plan Generator
    • Labels:


      Per [1], here are some interesting breakdowns of the total time:

      - 23% Parsing SPARQL queries
      - 17% Executing SPARQL queries (optimization plus evaluation)
      - 14% Query optimization. That is most of the 17% above, leaving roughly 3% for evaluation, so it looks like we spend MUCH more time optimizing queries than we do evaluating them.

      It looks like there is a lot of fat in the query optimization phase and perhaps in the query parser (which is known to drive the heap heavily).

      The iterator() methods are a big driver of heap allocation. The allocations come out of AbstractList.listIterator(), Collections$SynchronizedCollection.iterator(), AbstractSequentialList.iterator(), and ModifiableBOpBase$NotifyingList.iterator():

      "java.util.AbstractList.listIterator()","12230","390928", "1"
      "java.util.Collections$SynchronizedCollection.iterator()","8016","256512", "2"
      "java.util.AbstractSequentialList.iterator()","2771","88672", "2"
      "com.bigdata.bop.ModifiableBOpBase$NotifyingList.iterator()","1321","42272", "2"

      Some of the big drivers for those iterator methods are:

      - com.bigdata.bop.BOpUtility.preOrderIterator2(int, BOp)
      - com.bigdata.bop.BOpUtility.annotationOpIterator(BOp)

      Striterator.addFilter() shows up BIG with back traces through BOpUtility$Expand, which is part of the same iteration pattern.

      This suggests that we could win big if we could improve our iteration patterns over the AST.
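
As a rough illustration of the kind of change this points at, here is a minimal sketch that contrasts an iterator-based pre-order walk with an index-based one. The Node, Visitor, and AstWalk types are hypothetical stand-ins invented for this example; they are not the bigdata BOp API, and the real AST classes may need a different approach.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class AstWalk {

    /** Hypothetical stand-in for an AST operator; NOT the bigdata BOp API. */
    static final class Node {
        final String name;
        final List<Node> args = new ArrayList<>();

        Node(String name, Node... children) {
            this.name = name;
            for (Node c : children) {
                args.add(c);
            }
        }
    }

    /** Hypothetical callback invoked once per node, in pre-order. */
    interface Visitor {
        void visit(Node n);
    }

    /** Pre-order walk that requests a new Iterator from every node's child list. */
    static void walkWithIterators(Node n, Visitor v) {
        v.visit(n);
        for (Iterator<Node> it = n.args.iterator(); it.hasNext();) {
            walkWithIterators(it.next(), v);
        }
    }

    /** Pre-order walk that indexes into the child list and creates no Iterator. */
    static void walkByIndex(Node n, Visitor v) {
        v.visit(n);
        final List<Node> args = n.args;
        for (int i = 0, size = args.size(); i < size; i++) {
            walkByIndex(args.get(i), v);
        }
    }

    /** Collects node names in pre-order using the index-based walk. */
    static List<String> preOrderNames(Node root) {
        final List<String> out = new ArrayList<>();
        walkByIndex(root, n -> out.add(n.name));
        return out;
    }

    public static void main(String[] unused) {
        Node root = new Node("join",
                new Node("sp1"),
                new Node("filter", new Node("eq")));
        System.out.println(preOrderNames(root)); // prints [join, sp1, filter, eq]
    }
}
```

Both walks visit the same nodes in the same order; only the second avoids requesting an Iterator per node. The index-based form assumes the child list supports cheap random access (as ArrayList does). On a linked or sequential list, get(i) is linear, so this would trade allocation for CPU time. Whether the JIT already eliminates some of these iterator allocations would need to be confirmed with the profiler.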

      [1] (Index cache for Journal)

            • Assignee:
              michaelschmidt
              bryanthompson