For some reason, Regular Expressions have been very difficult for me... much like rocket science.
They are, however, crazy powerful.
Something I've been trying to implement is pulling in the time entries associated with a Fogbugz case and pushing them into the case's elapsed hours. Fogbugz's time entry system is not ideal for a services company, and most companies have their own time reporting system.
My approach is to use a text convention like this in the notes of my Freshbooks time sheet:
[123] Note goes here
The case number goes inside the square brackets. If the square brackets are left out, no case is associated with the time entry, as in this example:
Note goes here
What I want to do is use Pentaho to parse the note and split it into a case number and a note that I can insert into my data warehouse. I can then create a transformation that uses Fogbugz's API to push the total hours for each case back up to Fogbugz, so that I can use them for Evidence Based Scheduling.
So here's the regular expression I used to extract the data:
(\[(.*)\])(.*)
In Pentaho, this gives me three data fields (using my first example above):
Field 1: [123]
Field 2: 123
Field 3: Note goes here
Not bad. But for the second example, without a case reference, I get:
Field 1: Null
Field 2: Null
Field 3: Null
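To see the same behaviour outside Pentaho, here is a minimal Python sketch of that first pattern. The function name `split_note_original` is made up for illustration, and using `fullmatch` (the whole note must match) is my assumption about how the Pentaho step applies the expression.

```python
import re

# The first pattern from this post: bracketed case, case number, note.
ORIGINAL = re.compile(r"(\[(.*)\])(.*)")

def split_note_original(note):
    """Return (field1, field2, field3), or three Nones if nothing matches."""
    # Assumption: the whole string has to match, hence fullmatch.
    m = ORIGINAL.fullmatch(note)
    return m.groups() if m else (None, None, None)

print(split_note_original("[123] Note goes here"))
# ('[123]', '123', ' Note goes here')
print(split_note_original("Note goes here"))
# (None, None, None)
```

The brackets are mandatory in this pattern, so a note without them fails to match at all and every field comes back empty. In this sketch, field 3 also keeps the space that follows the closing bracket.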
I can easily fix this with a formula step in my transformation, but I know it can be handled in the regex itself. The only problem is that doing so is above my skill level.
Enter Stack Overflow. I posted a summary of the scenario above, and in less than 5 minutes, I had a working answer:
(\[(.*)\])?\s*(.*)
It's not very different from my original pattern, but for the second example it now gives me:
Field 1: Null
Field 2: Null
Field 3: Note goes here
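Here is a rough Python sketch of the answer's pattern, for anyone who wants to try it outside Pentaho. The names `split_note` and `DIGITS_ONLY` are hypothetical, and `fullmatch` is again my assumption about how the step applies the expression. The stricter digits-only variant is not from the Stack Overflow answer; it only makes sense if case numbers are always numeric.

```python
import re

# Pattern from the Stack Overflow answer: the bracketed case is optional.
OPTIONAL_CASE = re.compile(r"(\[(.*)\])?\s*(.*)")

# Hypothetical stricter variant, assuming case numbers are digits only.
DIGITS_ONLY = re.compile(r"(\[(\d+)\])?\s*(.*)")

def split_note(note, pattern=OPTIONAL_CASE):
    """Return (bracketed case, case number, note text) for one time entry."""
    m = pattern.fullmatch(note)
    return m.groups() if m else (None, None, None)

print(split_note("[123] Note goes here"))
# ('[123]', '123', 'Note goes here')
print(split_note("Note goes here"))
# (None, None, 'Note goes here')

# Caveat: the greedy .* runs to the LAST closing bracket in the note.
print(split_note("[123] see [docs]"))
# ('[123] see [docs]', '123] see [docs', '')
print(split_note("[123] see [docs]", DIGITS_ONLY))
# ('[123]', '123', 'see [docs]')
```

The `?` makes the whole bracketed group optional, and `\s*` swallows the space between the bracket and the note. If a note might contain its own square brackets, something like the digits-only variant may be the safer choice.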
Now I can map Fields 2 and 3 to my time entry table using only one Pentaho step. Nice!
One more thing: if you're not using Stack Overflow to solve your technical problems, isn't it about time you started?
