Elasticsearch restore fails when the s3-gateway is activated.

Huff, unfortunately I met this edge case. I have recovered from the situation. Here's my scenario.

  1. I am on Elasticsearch version 1.1.0.
  2. I have two data nodes: one primary and one replica.
  3. I take regular snapshots of my indexes.
  4. I am no longer taking snapshots; instead, I installed the s3-gateway plugin to keep my indexes persisted in an S3 bucket.

Because of a bulk import, I had stopped my replica to make the import a little faster. Once the import completed, I noticed high CPU and memory usage, and since I believed my indexes were safe behind the s3-gateway, I decided to restart the remaining data node. Fuck… it was a big mistake. After the restart it would not recover all the indexes, we were about to launch our site in the next two hours, and I was left with no index.

Struggling here and there, I came to know I was hit by a bug in Elasticsearch. I tried to follow the instructions at the end of that thread, where I was supposed to update/edit the metadata file in the S3 bucket. I did that, but no luck.

The problem I found: every index and shard is supposed to have a _source folder, and I had many indexes and shards where the _source folder was missing. Those indexes were unrecoverable. I had no solution at that point and was literally sweating in an air-conditioned room.

Then one of my colleagues, Narinder Kaur, joined me and gave me the necessary support, and we tried a few more experiments to fix it. Since I had already made one mistake, I first took a backup of the existing elasticsearch directory so I could get back to the same place in case of any further mess. And the solutions we were planning to try were total long shots.

So, here is the solution we tried, and which actually worked… Wow!

  1. I updated my elasticsearch.yml and removed the s3-gateway settings related to my S3 bucket.
  2. I stopped Elasticsearch.
  3. I renamed my old cluster directory (elasticsearch) to elasticsearch.original.
  4. I restarted Elasticsearch, and it created a new blank cluster with no indexes.
  5. I created all the required indexes with the same number of shards and replicas I had before. In my case that was 5 indexes and 5 shards per index.
  6. I stopped Elasticsearch again.
  7. I deleted (elasticsearch/nodes/0/indices/<index_name>/<0,1,2,3,4>/{index,translog}) and moved (elasticsearch.original/nodes/0/indices/<index_name>/<0,1,2,3,4>/{index,translog}) into its place.
    Note: I did not touch the _state folder of the blank indexes, so every shard of every index still has its _state folder.
  8. I repeated step 7 for all the indexes created in step 5.
  9. I restarted Elasticsearch and found all the indexes recovered.
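Steps 6 through 8 boil down to a loop over every index and shard. Here is a sketch of that loop, run against a throwaway mock of the two cluster directories so it is safe to execute anywhere; swap the mktemp base for your real data path, and the stand-in index and shard names for your own:

```shell
#!/bin/sh
set -e
# Mock layout standing in for the real cluster dirs (safe to run anywhere).
BASE=$(mktemp -d)
OLD="$BASE/elasticsearch.original/nodes/0/indices"
NEW="$BASE/elasticsearch/nodes/0/indices"
for idx in idx_a idx_b; do            # stand-ins for the real index names
  for shard in 0 1; do                # stand-ins for shards 0-4
    mkdir -p "$OLD/$idx/$shard/index" "$OLD/$idx/$shard/translog"
    mkdir -p "$NEW/$idx/$shard/index" "$NEW/$idx/$shard/translog" "$NEW/$idx/$shard/_state"
  done
done
# Steps 6-8: drop the blank index/translog, move the originals in,
# and leave the blank cluster's _state folders untouched.
for idx in idx_a idx_b; do
  for shard in 0 1; do
    rm -rf "$NEW/$idx/$shard/index" "$NEW/$idx/$shard/translog"
    mv "$OLD/$idx/$shard/index"    "$NEW/$idx/$shard/"
    mv "$OLD/$idx/$shard/translog" "$NEW/$idx/$shard/"
  done
done
ls "$NEW/idx_a/0"
```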

Observation: you should re-apply all your custom mappings to the blank indexes. I hit some errors because I had not executed my mappings.
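Per the observation above, the blank indexes need the same shard/replica settings and your custom mappings before the data folders are swapped in. A minimal sketch against the Elasticsearch 1.x create-index API; the type name, field name, and node address are made-up placeholders for your own setup:

```shell
#!/bin/sh
set -e
# Settings + mappings payload for one recreated index (placeholder mapping).
cat > create-index.json <<'EOF'
{
  "settings": { "number_of_shards": 5, "number_of_replicas": 1 },
  "mappings": {
    "post": { "properties": { "title": { "type": "string" } } }
  }
}
EOF
# Against a running node this would be:
# curl -XPUT 'http://localhost:9200/<index_name>' -d @create-index.json
python -m json.tool < create-index.json > /dev/null && echo "payload is valid JSON"
```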

Thank god, all the indexes were recovered. And thanks to Narinder Kaur; she got me the support I needed at that time.

How to install a GoDaddy SSL certificate on an Amazon load balancer.

I was struggling to get this done, and finally I made it. Here are the straightforward steps.

Requirements & Prerequisites:

  1. A Linux box with openssl and apache installed.
  2. Open a shell terminal on your Linux box and run:

openssl genrsa -des3 -out private.key 1024
openssl req -new -key private.key -out www.your-web-site.com.csr

It will ask you to add some basic information. Make sure you set the “Common Name” to your domain, like “www.xyz.com”.

  1. Go to www.godaddy.com and open your SSL management control panel.
  2. Select your certificate and click the Re-Key button.
  3. Copy the content of “www.your-web-site.com.csr”, paste it into the “CSR” field, and press Re-Key.
  4. It will prompt you to download the keys. I found the Apache, Other, and Nginx downloads are the same, so use any of them; by the way, I used “Other” to download my keys.
  5. Now unzip the downloaded archive. It should contain two *.crt files.

Now back to your terminal.

openssl rsa -in private.key -out private.pem

Now you will have the following files in your current directory.

  1. private.key
  2. private.pem
  3. www.your-web-site.com.csr
  4. sf_bundle.crt
  5. your-domain.com.crt
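Before pasting anything into the console, it is worth confirming that the private key and the issued certificate actually belong together: their RSA moduli must match. The sketch below demonstrates the check on a throwaway self-signed pair; in real life you would point the two openssl commands at private.pem and your-domain.com.crt:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
# Throwaway key and self-signed cert standing in for your real files.
openssl genrsa -out "$tmp/private.pem" 2048 2>/dev/null
openssl req -new -x509 -key "$tmp/private.pem" -subj "/CN=www.xyz.com" \
        -days 1 -out "$tmp/cert.crt"
# The modulus hashes must be identical for a matching key/cert pair.
k=$(openssl rsa  -noout -modulus -in "$tmp/private.pem" | openssl md5)
c=$(openssl x509 -noout -modulus -in "$tmp/cert.crt" | openssl md5)
[ "$k" = "$c" ] && echo "key and certificate match"
```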

Now open your load balancer console and add HTTPS support. It will prompt you to add the following values.

  1. Certificate Name:* -> Put any friendly name
  2. Private Key:* -> Paste content of private.pem
  3. Public Key Certificate:* -> Paste content of your-domain.com.crt.
  4. Certificate Chain: -> Paste content of sf_bundle.crt

Save this and you are done. Quite easy, no?

Continuous Integration Experience.

As a DevOps engineer, it was my first experience creating a strategy for an automatic deployment stack. It was really a great experience, and I got into the complete execution flow of the Agile methodology. The whole idea is to build a clean continuous-deployment pipeline with every level of test: unit tests, code coverage, acceptance testing, and so on…
First I tried Hudson. It is really a great tool, but I found it a little time-consuming; maybe because it was my first drive. In the second innings I tried Jenkins, and it made me really comfortable. I just fell in love with Jenkins. I found many “Jenkins vs Hudson” comparisons; at this level my preference is purely personal, and the second innings might be the reason, but I would like to stay with Jenkins. I am now going to create a whole CI-server stack in my local environment for one of our projects. The client is concerned about Amazon’s cost, so it really makes sense to use local resources to generate quality output. And my team feels the same as I do.
I would like to thank the Jenkins community for their great contribution. They have really made an outstanding product. Hats off to such contributors.

Our development workflow with git.

This experience is well worth sharing. I have used git since 2009, but I have been using it in production for no more than three months. We were using SVN before and have now moved to git. It is really cool. At first we used only the master branch: all developers pushed there, the code moved to the development server, and after testing it was deployed to the production server. That is quite an unstable process, and as we got into the requirement of better development with less effort, we seriously needed to think about branching and all such stuff.
Well, I created a quite simple but powerful scenario. The master branch is now our production-ready branch, and the development branch is our dev-server branch. These two branches stay in the system for an infinite time. And I introduced some short-lived branches, like “feature branches” and “release branches”, which really play a great role in the architecture we are working on.
We are using “Pivotal Tracker” for our Agile methodology, so when we have a new milestone with a story id, the developer needs to create a new branch named “Feature-<story-id>”. This branch is cloned from the development branch, stays in the system until the feature is complete, and is then merged back into the development branch. So over the whole release we complete all Pivotal stories by story id.
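The branch lifecycle above can be sketched in plain git commands. This demo runs in a throwaway repository, and “12345” is a made-up Pivotal story id:

```shell
#!/bin/sh
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
echo base > app.txt && git add . && git commit -qm "initial"
git branch development                         # long-lived dev branch
git checkout -q -b Feature-12345 development   # feature branch cloned from development
echo feature >> app.txt && git commit -qam "work on story 12345"
git checkout -q development
git merge -q --no-ff -m "deliver story 12345" Feature-12345   # merge back when done
git branch -d Feature-12345 > /dev/null        # short-lived: delete after the merge
git log --oneline
```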
I am looking for an automated process where a story gets started when the developer creates the feature branch, and when he delivers the whole feature and merges the branch back into development, the status of the story automatically changes to “Delivered”. The QA team will then test and either accept or reject the corresponding story.
I will share if I figure this out. Overall, I am facing really challenging and quite interesting situations and finding solutions for them.

Being a DevOps At YourSports

WOW! This is what I dreamed of being. A really exciting job. Over the last couple of years I did lots of work in social-networking applications, cloud infrastructure management, and Ruby. Since I joined YourSports I have been damn busy being a DevOps, and it is quite hard to find time to write here, but I am trying again. No doubt this post comes after a long gap, but I will try to keep it up. My job is to manage all project activities between engineers. Our engineers are a distributed team, and I have to communicate with them across different timezones. Yeah, somewhat difficult, but I'm really enjoying it. Cloud infrastructure is something I am a die-hard fan of. It is a great feeling to be involved with something that needs inventing, and I am experiencing exactly that. Working with engineers from different communities and cultures has also improved my management skills. What we are building is on Elgg, which itself makes me proud of my team (Karam (Elgg developer), Chetan Sharma (Elgg developer), Narinder Kaur (Lucene expert), Daniyal Nawaz (UI engineer)). Special thanks to the Elgg team for their incredible product. There is nothing you cannot achieve with Elgg, even with its EAV infrastructure, which is Elgg's strongest point.

Round-robin at the application level to balance MySQL database load.

The round-robin technique lets you distribute your work across any number of available resources, even at different locations. Huge-traffic sites like Facebook have to have such techniques working in the background to serve requests as fast as possible. I would like to discuss one of my personal implementation experiences for such a large, high-potential social-networking site. Cloud computing is really helpful, but it also needs a logical approach at the programming level.
Approach 1: Six servers architecture on amazon cloud.
WOW! I implemented 1 load balancer, 1 MySQL master DB, 1 MySQL slave DB, and 3 application servers. Such an architecture can really handle huge traffic, and as far as traffic growth is concerned, we can add more application servers anytime we need. So user requests get balanced across the 3 application servers and get their responses. But in my application I had one more problem: 1 click corresponds to about 100+ SQL queries. Hmmmm. So the MySQL load is never balanced with this technique, and it has to be, because 1 request triggers 100+ SQLs.
So I drilled down more and decided to separate SQL reads and writes. With this, I got the opportunity to dedicate one server to MySQL writes, and I initiated one MySQL replicated server for reads.
Did this really get me to the end of the performance problem?
No. Because out of 100+ SQLs, how many could be writes? Session access-time updates, some counter updates, etc. Not too many. So my write server still had resources sitting unused.
Here is where Round Robin comes in.
If I could develop a logic that distributes my 100+ SQLs across any number of available replicated instances, that could really work for me. Say I have 5 read servers for 100+ SQLs; then I can distribute around 20 SQLs per server per request. And as we add read servers, the system adjusts itself to distribute (SQL queries) / (number of servers) (Qn / Sn). This way, all of my servers work on every SQL requested from the system, and I get maximum performance from them. There is no use having 1000 servers if 1 server answers 1 complete request, because then 999 servers sit idle, which is a waste of money. So I implemented this in my PHP application, and it really makes sense on the cloud to use the maximum available resources.
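The Qn / Sn idea can be shown with a toy loop: spread a batch of read queries over whichever replicas are configured. In my case the logic lived inside the PHP application; the snippet below is just a stand-alone illustration with made-up server names:

```shell
#!/bin/sh
# Three hypothetical read replicas; each query goes to the next one in turn.
set -- db-read-1 db-read-2 db-read-3
n=$#
i=0
for q in q1 q2 q3 q4 q5 q6; do   # six fake read queries
  idx=$(( i % n + 1 ))           # round-robin index into the server list
  eval "host=\${$idx}"
  echo "$q -> $host"
  i=$(( i + 1 ))
done
```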

How to create a custom Amazon AMI.

Today I am going to explain how you can create a custom Amazon AMI so you can launch an instance from it anytime later. This gives you a clone of your server whenever you need one. I am assuming you are able to log in to your currently running instance and that you have your private key and certificate downloaded somewhere locally.

  1. Upload your private key and certificate to the running instance:
    scp -i path/of/yourkeypair.pem path/of/cert.pem root@<instance-public-dns>:/mnt
    scp -i path/of/yourkeypair.pem path/of/pk.pem root@<instance-public-dns>:/mnt
  2. Log in to your instance and check that the uploaded files are available in /mnt.
  3. ec2-bundle-vol -d /mnt -k /mnt/pk.pem -c /mnt/cert.pem -u 673491274719 -p name-of-ami

    This will take some time and create the desired AMI bundle in /mnt, ready to be uploaded to a bucket so you can use it later anytime you need.

  4. Now upload your bundle to Amazon S3 storage.
    ec2-upload-bundle -b <S3-bucket-name> -m /mnt/name-of-ami.manifest.xml -a <AWS-access-key-id> -s <AWS-secret-access-key> --location US

    Note: Remember to upload to an S3 bucket in the correct region; if the bucket does not exist, it will be created for you. (--location takes the S3 location constraint, e.g. US or EU; a value like US-EAST-1C is an availability zone, not a location.)

  5. Now we need to register the AMI. Do the following:
    ec2-register <bucket-name>/name-of-ami.manifest.xml --region us-east-1

    It will return the new AMI ID (like ami-).

That’s it, you are done with your custom AMI.