Hey there, people on the internet. Here's my solution for the Grep - B Linux shell challenge from HackerRank. You can find answers to other Linux shell challenges via this link: https://blog.shasec.rocks/post/hackerrank-bash-challs. So let's get started.

Challenge

> An Introduction to Grep

Grep is a multi-purpose search tool, which is used to find specified strings or regular expressions. A variety of options exist, which make it possible to use the command in several different ways and to handle many different situations. For example, one might opt for case-insensitive search, or to display only the fragment matching the specified search pattern, or to display only the line number of an input file where the specified string or regular expression has been found.
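
To make the three options mentioned above concrete, here is a minimal sketch. The log lines are invented for the demonstration and are not part of the challenge; the flags themselves (-i, -o, -n) are standard grep options.

```shell
# Hypothetical log lines, invented for this demonstration.
sample='Error: disk full
warning: low memory
error: code 42'

# -i: case-insensitive search, so both "Error" and "error" match.
insensitive=$(printf '%s\n' "$sample" | grep -i 'error')

# -o: print only the fragment that matched, not the whole line.
only_match=$(printf '%s\n' "$sample" | grep -o '[0-9][0-9]*')

# -n: prefix each matching line with its line number in the input.
numbered=$(printf '%s\n' "$sample" | grep -n 'warning')

printf '%s\n' "$insensitive" "$only_match" "$numbered"
```

Note that -n prints the line number followed by a colon and the line itself, so getting the number alone takes one more step (for example, cutting at the first colon).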

More details about common examples of grep usage may be read here.  

Before using grep it is recommended that one should become familiar with regular expressions, to be able to harness the command to its fullest.

> Recommended References

15 Practical Grep Command Examples

TLDP Examples for Grep

Grep Regular Expressions

Grep Regular Expressions on the GNU site

> Current Task

Given an input file with N credit card numbers, each on a new line, your task is to grep out and output only those credit card numbers which have two or more consecutive occurrences of the same digit (which may be separated by a space, if they are in different segments). Assume that the credit card numbers will have 4 space-separated segments with 4 digits each.

If the credit card number is 1434 5678 9101 1234, there are two consecutive instances of 1 (though separated by a space), as highlighted in box brackets: 1434 5678 910[1] [1]234

Here are some credit card numbers where consecutively repeated digits have been highlighted in box brackets. The last case does not have any repeated digits:

1234 5678 910[1] [1]234

2[9][9][9] 5178 9101 [2][2]34

[9][9][9][9] 5628 920[1] [1]232

8482 3678 9102 1232

> Input Format

N credit card numbers. Assume that the credit card numbers will have 4 space separated segments with 4 digits each.

> Constraints

1<=N<=20

However, the value of N does not matter while writing your command.

> Output Format

Display the required lines after filtering with grep, without any changes to their relative ordering in the input file.

> Sample Input

1234 5678 9101 1234  
2999 5178 9101 2234  
9999 5628 9201 1232  
8482 3678 9102 1232

> Sample Output

1234 5678 9101 1234  
2999 5178 9101 2234  
9999 5628 9201 1232

> Explanation

Consecutively repeated digits have been highlighted in box brackets. The last case does not have any repeated digits:

1234 5678 910[1] [1]234

2[9][9][9] 5178 9101 [2][2]34

[9][9][9][9] 5628 920[1] [1]232

8482 3678 9102 1232

Solution

Grep man pages

Figure 1.0: Use the -E option if you're playing with extended regex

To be honest, I don't know the regex to find duplicate digits. Google is your friend :)

Figure 1.1: Got what I needed. This is the starting point

Customizing the regex to suit our needs:

Figure 1.2: https://regexr.com

Let's break this down:

  1. [0-9] matches any single digit from 0 to 9
  2. The parentheses () create a capture group around that digit
  3. The space matches a literal space
  4. \1 (a backreference) matches the same text that group 1, in this case ([0-9]), just captured
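
To see what the backreference adds, here is a small sketch that compares two patterns on the example number from the challenge. It assumes the regex in the screenshot is `([0-9]) \1`, as the breakdown suggests, and it relies on grep accepting backreferences with -E, which GNU grep does as an extension.

```shell
line='1234 5678 9101 1234'

# No backreference: any digit, a space, any digit.
# Every segment boundary matches, repeated digit or not.
any_pair=$(printf '%s\n' "$line" | grep -E -o '[0-9] [0-9]' || true)

# With the backreference: the digit after the space must be
# the same digit that the group captured before the space.
same_pair=$(printf '%s\n' "$line" | grep -E -o '([0-9]) \1' || true)

printf 'any pair:\n%s\nsame pair:\n%s\n' "$any_pair" "$same_pair"
```

The -o flag is only there to show which fragment matched; the actual solution prints whole lines.
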
Figure 1.3: https://regexr.com

Let's break this down:

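  1. * (star) applies to the space right before it and means "zero or more", so the repeated digit may follow immediately (same segment) or after a space (next segment); \1 is still the backreference to the captured digit

#!/bin/bash

grep -E '([0-9]) *\1'

Figure 1.4: Works on my terminal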
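
As a rough check of what the star changes, the sketch below runs the pattern with and without it on the sample input from the challenge. The variable names are just for illustration, and it again assumes a grep that accepts backreferences with -E (GNU grep does).

```shell
# Sample input from the challenge statement.
cards='1234 5678 9101 1234
2999 5178 9101 2234
9999 5628 9201 1232
8482 3678 9102 1232'

# Without the star: only repeats that straddle a space are found,
# so the second card (repeats inside segments only) is missed.
strict=$(printf '%s\n' "$cards" | grep -E '([0-9]) \1' || true)

# With the star: " *" allows zero or more spaces between the two digits,
# so "99" and "22" inside a segment match as well as "1 1" across segments.
relaxed=$(printf '%s\n' "$cards" | grep -E '([0-9]) *\1' || true)

printf 'without star:\n%s\n\nwith star:\n%s\n' "$strict" "$relaxed"
```

Only the version with the star returns all three cards listed in the sample output.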

The -E version works on my terminal but, unfortunately, it does not work on the HackerRank platform. Thanks to @AlleyPereira, I learned that you have to escape the parentheses and drop the -E flag for it to work on HackerRank. Without -E, grep uses basic regular expressions, where a group is written \( \) instead of ( ).

#!/bin/bash

grep '\([0-9]\) *\1'

Figure 1.5: Solution passed all HackerRank test cases
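
If you want to try the escaped version locally before submitting, here is a minimal sketch that pipes the sample input into it and compares the result with the sample output. The `solve` wrapper is a hypothetical name added for the demonstration; the grep command inside it is the one above, reading from standard input because no file is given.

```shell
# solve is a hypothetical wrapper; the grep command is the one from this post.
solve() {
  grep '\([0-9]\) *\1'
}

# Sample input and expected output, copied from the challenge statement.
actual=$(printf '%s\n' \
  '1234 5678 9101 1234' \
  '2999 5178 9101 2234' \
  '9999 5628 9201 1232' \
  '8482 3678 9102 1232' | solve)

expected='1234 5678 9101 1234
2999 5178 9101 2234
9999 5628 9201 1232'

if [ "$actual" = "$expected" ]; then
  echo "matches the sample output"
else
  echo "differs from the sample output"
fi
```

This only covers the one sample case, so it is a sanity check rather than a substitute for the HackerRank test cases.
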

If you like content like this, please consider buying me a coffee.

Thank you for your support :)

