Blazegraph (by SYSTAP) / BLZG-1383

InlineIV dependent on Java compiler and system encoding settings?

    Details

      Description

      Hi,

      We have used InlineIVs with great success so far; we saw almost a doubling in speed.

      However, a while ago a distribution zip created by my colleague failed with an error in the bigdata.war. There was a problem serializing a string in the database using our custom InlineIVs-class.

      Looking into the matter, we found there was a difference between his compiled version of our InlineIV-class and my version. It had to do with accented characters. Using WinMerge we saw this:

      http://linkeddata.overheid.nl/terms/rvr-inachtnemingPrejudicieleBeslissing-naPrejudiciëleBeslissingVan

      vs

      http://linkeddata.overheid.nl/terms/rvr-inachtnemingPrejudicieleBeslissing-naPrejudiciëleBeslissingVan

      This looks like an ISO-8859-1 vs Unicode encoding issue. I have fixed our build process to specify the encoding, and that seems to have resolved the issue. Apparently specifying the encoding is important when compiling a custom InlineIVs-class.
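
The mechanism behind such a mismatch can be reproduced in isolation. The following is a minimal sketch, not the reporter's code: the class name `EncodingMismatchDemo` and the helper `misread` are hypothetical, and it assumes the source file was saved as UTF-8 while one of the builds read it as ISO-8859-1. It writes the accented character as a Unicode escape so the example itself does not depend on the compiler's `-encoding` setting.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // The URI from the report, with the accented character written as a
    // Unicode escape so this file compiles identically under any -encoding.
    static final String URI =
        "http://linkeddata.overheid.nl/terms/"
        + "rvr-inachtnemingPrejudicieleBeslissing-naPrejudici\u00EBleBeslissingVan";

    // Simulates a compiler that decodes UTF-8 source bytes as ISO-8859-1
    // (assumed scenario; the report does not say which build used which).
    static String misread(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String wrong = misread(URI);
        System.out.println("lengths: " + URI.length() + " vs " + wrong.length());
        System.out.println("equal:   " + URI.equals(wrong));
    }
}
```

The two-byte UTF-8 form of the accented character turns into two separate characters when misread, so the string constants compiled into the two class files differ. Passing `-encoding UTF-8` to `javac` (or the equivalent setting in the build tool) removes the dependence on the platform default.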

      However, there was another problem we noticed. Different versions of Java generate different values for the InlineIV-strings. This seems to be caused by the position of the "class" entry for our custom class in the constant pool: in Java major version 52 it is generated before all strings, in Java major version 51 after all strings:

      major version 52:
      #8 = Class #282 // nl/overheid/linkeddata/blazegraph/LinkedDataVocabularyDecl

      major version 51:
      #243 = Class #517 // nl/overheid/linkeddata/blazegraph/LinkedDataVocabularyDecl

      Due to the unpredictability we have removed the InlineIVs for now. While I have reported this as a bug, is there anything we can do to fix this, or do we have to wait for a new release with a fix?
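
To check which major version a given build actually produced, the class file header can be read directly. This is a minimal sketch with a hypothetical class name (`ClassFileVersion`); it relies only on the standard class file layout (4 bytes magic, 2 bytes minor version, 2 bytes major version) and says nothing about constant pool ordering itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ClassFileVersion {

    // Returns the major version from a class file header:
    // magic 0xCAFEBABE, then minor version, then major version.
    static int majorVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (classBytes.length < 8 || buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        buf.getShort();                 // minor version, ignored here
        return buf.getShort() & 0xFFFF; // 51 = Java 7, 52 = Java 8
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClassFileVersion <path-to-.class>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("major version: " + majorVersion(bytes));
    }
}
```

Running it against the compiled vocabulary class from each distribution shows whether the builds targeted different class file versions. The `javac` options `-source` and `-target` pin the version regardless of which JDK performs the build.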

        Activity

        bryanthompson bryanthompson added a comment -

        Thank you for bringing this issue to our attention. For the first problem, specifying the encoding to be used would appear to be the correct approach. We do specify the encoding for the blazegraph compilation in the build.xml file.

        I am not clear on the second aspect of the issue that you have identified. You can also control the target JVM version for the generated byte code. This might address your second point. This is again something that we do in build.xml.

        It would help if you could explain the second aspect of the problem in some more depth and also help us to recreate this problem or clarify that specifying the compiler code generation target resolves the issue for you.

        Thanks,
        Bryan

        hhv Huib Verweij added a comment -

        Hi Bryan,

        I suspect specifying the target JVM will solve this problem.

        I will check it out because having the InlineIVs will be very beneficial to the performance.

        It was just something we never ran into before. We use the .war distribution file of Blazegraph and therefore never noticed that a specific encoding or JVM was required. Apparently it is very important, because in some areas the Java compiler really behaves differently.

        Best regards,

        Huib.

        beebs Brad Bebee added a comment -

        Huib,

        Let us know how it turns out. In our CI environment, we have to explicitly set the file.encoding and sun.jnu.encoding. These are for EC2 instances running AMI Linux, where the default locale is not always set.

        java ...  -Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8 ...
        

        Thanks, --Brad
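
To confirm what a running JVM actually picked up, the effective settings can be printed. This is a small sketch with a hypothetical class name (`EncodingSettings`); `sun.jnu.encoding` is a JDK-internal property and may be absent on some runtimes, in which case it prints as `null`.

```java
import java.nio.charset.Charset;

public class EncodingSettings {

    // Summarizes the encoding-related settings of the current JVM.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
            + ", sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
            + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without the `-D` flags shown above makes any difference between the environment default and the explicit setting visible.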

        bryanthompson bryanthompson added a comment -

        Per the thread, setting the encoding and the compiler target is sufficient to provide compatibility.

        Assigned to Brad to capture this information in the user manual.

        beebs Brad Bebee added a comment -

        Documented on Wiki/User's Guide at https://wiki.blazegraph.com/wiki/index.php/Installation_guide#Java_Requirements .

          People

          • Assignee:
            bradbebee bradbebee
            Reporter:
            hhv Huib Verweij