Wednesday, December 16, 2020

AWS Beanstalk Sensitive Info Exposed

The best way to learn something is to build projects from scratch. Whenever I have free time, I try to create different kinds of projects. This time I decided to build an event-driven pipeline for data transformation, processing, and real-time search (a Big Data use case). It is a standard ETL architecture built on AWS S3, AWS SQS, and AWS Lambda. I also use Lambda functions to create data partitions on S3, and Elastic Beanstalk for data processing.

I just noticed that the application was exposing sensitive information: the HTTP response contained the following header, which reveals the exact server software versions:

Server: Apache/2.4.39 (Amazon) OpenSSL 1.0.2l-fips

The solution is pretty straightforward and is covered in the Apache documentation (see the ServerTokens and ServerSignature directives). Note that these directives only trim the header; if you need to remove it completely, you have to install the mod_security module. On the Elastic Beanstalk side, everything is simple: we need to create a new Apache config for the Beanstalk environment.

Location: Amazon S3/$bucket name/build/$app/.ebextensions/httpd/conf.d/
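After deploying the new config, it is worth checking that the header no longer gives anything away. Below is a minimal Java sketch of such a check. The class name `ServerHeaderCheck`, the method `leaksVersion`, and the rule "a digit right after a slash or a space counts as a version token" are my own assumptions for illustration, not part of Apache or AWS. With `ServerTokens Prod` Apache sends just `Server: Apache`, which this check treats as clean.

```java
import java.util.regex.Pattern;

final class ServerHeaderCheck {

    // Assumption: a digit right after "/" or a space indicates a version token.
    private static final Pattern VERSION_TOKEN = Pattern.compile("[/ ]\\d");

    // Returns true when the Server header value appears to contain a version number.
    static boolean leaksVersion(String serverHeader) {
        return serverHeader != null && VERSION_TOKEN.matcher(serverHeader).find();
    }

    public static void main(String[] args) {
        // The header value from this post leaks versions.
        System.out.println(leaksVersion("Apache/2.4.39 (Amazon) OpenSSL 1.0.2l-fips")); // true
        // The trimmed header does not.
        System.out.println(leaksVersion("Apache")); // false
    }
}
```

In a real check you would read the header value from an actual HTTP response and pass it to `leaksVersion`.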

Next time I may share the entire project architecture: how I built real-time processing and search on AWS, why I use AWS Lambda, why I create Parquet files for the metadata, how I run distributed processing on Elastic Beanstalk with Apache Ignite, and so on.


Have a good holiday!

Saturday, March 7, 2020

Fork Join pool - sometimes the solution is really simple


Let's look at how people use the Fork/Join pool in Java 7. Imagine that you have a class, FactorialTask, that extends RecursiveTask<BigInteger>:


@Override
protected BigInteger compute() {
    if ((n - start) >= THRESHOLD) {
        Collection<FactorialTask> tasks = ForkJoinTask
                             .invokeAll(getSubTasks());
        BigInteger result = BigInteger.ONE;
        for (FactorialTask t : tasks) {
            result =  t.join().multiply(result);
        }
        return result;
    } else {
        return calculate(start, n);
    }

}


OK, and do you know how some people translate this code into Java 8 style? Yep, this is the kind of code you may find when you surf the web:

@Override
protected BigInteger compute() {
    if ((n - start) >= THRESHOLD) {
        // Fork the subtasks, then join and multiply their results.
        return ForkJoinTask
                .invokeAll(getSubTasks())
                .stream()
                .map(ForkJoinTask::join)
                .reduce(BigInteger.ONE, BigInteger::multiply);
    } else {
        return calculate(start, n);
    }
}


Of course, they still need the helper methods:

    private BigInteger calculate(int start, int finish) {
        return IntStream.rangeClosed(start, finish)
                .mapToObj(BigInteger::valueOf)
                .reduce(BigInteger.ONE, BigInteger::multiply);

    }
    private Collection<FactorialTask> getSubTasks() {
        List<FactorialTask> tasks = new ArrayList<>();
        int mid = (start + n) / 2;
        tasks.add(new FactorialTask(start, mid));
        tasks.add(new FactorialTask(mid + 1, n));
        return tasks;
    }


To invoke it:

public BigInteger factorial(Integer number){
    ForkJoinPool pool = ForkJoinPool.commonPool();
    return pool.invoke(new FactorialTask(number));
}
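For reference, the fragments above can be assembled into one compilable class. This is only a sketch: the fields `start` and `n`, the two constructors, and the `THRESHOLD` value of 20 are my assumptions, since the snippets above do not show them.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;
import java.util.stream.IntStream;

final class FactorialTask extends RecursiveTask<BigInteger> {

    // Assumption: ranges wider than this are split into two subtasks.
    private static final int THRESHOLD = 20;

    private final int start;
    private final int n;

    // Assumed constructors: the one-argument form covers the range 1..n.
    FactorialTask(int n) {
        this(1, n);
    }

    FactorialTask(int start, int n) {
        this.start = start;
        this.n = n;
    }

    @Override
    protected BigInteger compute() {
        if ((n - start) >= THRESHOLD) {
            // Fork both halves, then join and multiply their results.
            return ForkJoinTask.invokeAll(getSubTasks())
                    .stream()
                    .map(ForkJoinTask::join)
                    .reduce(BigInteger.ONE, BigInteger::multiply);
        }
        return calculate(start, n);
    }

    // Sequentially multiplies all integers in [start, finish].
    private BigInteger calculate(int start, int finish) {
        return IntStream.rangeClosed(start, finish)
                .mapToObj(BigInteger::valueOf)
                .reduce(BigInteger.ONE, BigInteger::multiply);
    }

    // Splits the current range into two halves of roughly equal size.
    private Collection<FactorialTask> getSubTasks() {
        List<FactorialTask> tasks = new ArrayList<>();
        int mid = (start + n) / 2;
        tasks.add(new FactorialTask(start, mid));
        tasks.add(new FactorialTask(mid + 1, n));
        return tasks;
    }

    static BigInteger factorial(int number) {
        return ForkJoinPool.commonPool().invoke(new FactorialTask(number));
    }
}
```

Calling `FactorialTask.factorial(5)` returns 120; larger inputs are split recursively until each range is narrower than the threshold.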

BUT wait! Java 8+ has parallel streams, so why not rewrite this code as a single expression? It really is that simple:

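// The identity (BigInteger.ONE) makes reduce return a BigInteger instead of an Optional.
IntStream.rangeClosed(1, number)
         .parallel()
         .mapToObj(BigInteger::valueOf)
         .reduce(BigInteger.ONE, BigInteger::multiply);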
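Wrapped in a method, the stream version might look like the sketch below. The class name `ParallelFactorial` is hypothetical. I pass `BigInteger.ONE` as the identity so that the method returns a `BigInteger` directly; the one-argument `reduce` returns an `Optional<BigInteger>` instead. The parallel reduction is safe here because multiplication is associative, and by default a parallel stream runs on the same common Fork/Join pool that was used explicitly above.

```java
import java.math.BigInteger;
import java.util.stream.IntStream;

final class ParallelFactorial {

    // Computes number! by multiplying 1..number in parallel.
    // An empty range (number < 1) yields the identity, BigInteger.ONE.
    static BigInteger factorial(int number) {
        return IntStream.rangeClosed(1, number)
                .parallel()
                .mapToObj(BigInteger::valueOf)
                .reduce(BigInteger.ONE, BigInteger::multiply);
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // prints 120
    }
}
```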



Sometimes the solution is quite simple.