Fakebook Indexes for CSV import
#21
(01-20-2016, 10:52 PM)sciurius Wrote: I trust your code doesn't crash. Neither does mine ;-)

Good to know ;-)

(01-20-2016, 10:52 PM)sciurius Wrote: But my objection is that using only a starting page may not be sufficiently deterministic under all circumstances.
Given the example in my posting, you would need to collect all song data and sort on page number to be sure.

Yes; that's what my code does.  Do you envisage any problems with that?

(01-20-2016, 10:52 PM)sciurius Wrote: Also:
Quote:"5-" would mean page 5 and all subsequent pages until the next page not claimed by any other song

The usual interpretation (e.g., LaTeX): If you specify a range consisting of a hyphen (or any tie) but with one or two empty page numbers, the following will happen:

1. a range of the form -34 is taken to mean pages 1 to 34;
2. a range of the form 12- is taken to mean page 12 to last page;
3. a range of the form - (only hyphen) is taken to mean page 1 to last page.
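
For reference, the three rules above can be sketched in Python roughly as follows. The function name `parse_page_range` and the `last_page` argument are made up for illustration; this is not code from LaTeX or from any tool discussed in this thread.

```python
def parse_page_range(spec, last_page):
    """Resolve a range such as '3-7', '-34', '12-', '-' or '5' to (first, last).

    An empty start means page 1; an empty end means the last page.
    """
    spec = spec.strip()
    if "-" not in spec:
        # A bare number is a single page.
        page = int(spec)
        return page, page
    first, _, last = spec.partition("-")
    first = first.strip()
    last = last.strip()
    return (int(first) if first else 1, int(last) if last else last_page)
```

Under this reading, "12-" on a 100-page document always runs to page 100, regardless of where other songs start.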

Yes, but I would argue that the usual interpretation is less useful in this scenario than the meaning I am proposing.  Otherwise you manually have to go through every single song and provide its last page.

Another approach might be to provide a simple Ruby / Python script which automatically calculates the last pages via the same algorithm and then outputs an updated version of the CSV file with those numbers included.  Then the updated version would be fed into my PDFexploder / MSP / whatever else.  But this two-phase approach is less convenient for the end user and I don't see any real advantage to it.  Am I missing something?
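
A minimal sketch of what such a script might look like in Python. The column names `title` and `pages` and the `total_pages` argument are assumptions for illustration, not the actual PDFexploder or MSP format. It reads an open-ended range like "5-" as "up to the page before the next higher starting page of any song, or the last page of the book if there is none", which is one possible reading of the proposal above.

```python
import csv
import io


def fill_last_pages(rows, total_pages):
    """Return copies of rows with every 'pages' value made an explicit range."""
    # Collect all starting pages so row order in the CSV does not matter.
    starts = sorted({int(row["pages"].split("-")[0]) for row in rows})
    result = []
    for row in rows:
        first_text, sep, last_text = row["pages"].partition("-")
        first = int(first_text)
        if not sep:
            last = first  # single page, e.g. "9"
        elif last_text.strip():
            last = int(last_text)  # explicit range, e.g. "5-7"
        else:
            # Open-ended range, e.g. "5-": stop before the next song starts.
            later = [s for s in starts if s > first]
            last = later[0] - 1 if later else total_pages
        result.append({**row, "pages": f"{first}-{last}"})
    return result


def rewrite_csv(text, total_pages):
    """Read CSV text and return CSV text with all last pages filled in."""
    reader = csv.DictReader(io.StringIO(text))
    rows = fill_last_pages(list(reader), total_pages)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The same `fill_last_pages` step could run inside the importer instead of as a separate pass, which is the single-phase variant argued for above.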

(01-21-2016, 02:26 AM)sciurius Wrote: I'm going to carry the use of CSV metadata a step further.

First, I'm currently finishing a tool that reads iRealPro data and formats this into a nice PDF.

You mean a PDF containing a table of contents of all the songs?  My PDFexploder does that, using LaTeX actually:

https://github.com/aspiers/PDFexploder/b....latex.erb

(01-21-2016, 02:26 AM)sciurius Wrote: iRealPro songs contain a limited amount of metadata like title, composer, style, key, and tempo.
If the iRealPro data contains an iRealPro playlist, I produce a multi-page PDF document and the corresponding metadata CSV. In other words: a Fakebook plus index in one go.

My code originally combined multiple fakebooks into a single giant PDF which started with a huge ToC, and then had all songs from all fakebooks sorted alphabetically (rather than just concatenating all fakebooks together).  This was quite cool, but I realised that a huge PDF is unwieldy, and it's much nicer to have one song per PDF, because this works more smoothly regardless of what music reader you choose to use.  Even in MSP it makes it easier to build set lists by cherry-picking songs.

(01-21-2016, 02:26 AM)sciurius Wrote: For reasons not relevant here, my tool can also produce PNGs instead of a PDF. This brings me to the feature request to extend the use of metadata CSV for other imports (in particular, batch import) as well.

For example, I have a folder with ChordPros or PNGs, each containing one song. I can batch import this folder, but it would be very nice if I could place a metadata CSV in the folder (or specify on the import dialog) so that all imported songs can have some metadata filled in.
I even think that given how far Mike already implemented support for metadata CSVs this won't be hard to add.
Implementation hint: Add a "filename" or "pathname" element to the CSV to match a file with its metadata.

Yes, that would be awesome, and would also make it easy to bulk import PDFs generated by my PDFexploder.
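
To make the "filename" idea concrete, here is a rough Python sketch of how files in a batch-import folder could be paired with CSV rows. The function name `match_metadata`, the `filename` column name, and the case-insensitive match on the base name are all assumptions; nothing here reflects how MSP actually works.

```python
import csv
import io
from pathlib import PurePath


def match_metadata(filenames, csv_text, key="filename"):
    """Pair each file with the CSV row whose `key` column names it.

    Returns (matched, unmatched): a dict of file -> metadata, and a list of
    files that have no row in the CSV.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Index rows by base name only, ignoring case and any directory part.
    by_name = {PurePath(row[key]).name.lower(): row for row in reader}
    matched, unmatched = {}, []
    for name in filenames:
        row = by_name.get(PurePath(name).name.lower())
        if row is None:
            unmatched.append(name)
        else:
            # The filename column only locates the file; it is not song metadata.
            matched[name] = {k: v for k, v in row.items() if k != key}
    return matched, unmatched
```

Files without a matching row would simply be imported without metadata, as batch import does today.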
Reply
#22
(01-21-2016, 03:43 AM)aspiers Wrote: Yes; that's what my code does.  Do you envisage any problems with that?

Mostly that it is a totally unnecessary complication.

Quote:Yes, but I would argue that the usual interpretation is less useful in this scenario than the meaning I am proposing.  Otherwise you manually have to go through every single song and provide its last page.
...
Another approach might be to provide a simple Ruby / Python script which automatically calculates the last pages via the same algorithm and then outputs an updated version of the CSV file with those numbers included. ... But this two-phase approach is less convenient for the end user and I don't see any real advantage to it. Am I missing something?

A rule of thumb is that when you generate data once, and process it often, it is better to put the overhead in the generating phase.
Johan
http://www.johanvromans.nl - http://www.howsagoin.nl - http://www.hetgeluidvanseptember.nl
Samsung Galaxy Note 2 (N8010) 10.1", Android 7.1.2 (LineageOS), AirTurn Duo.
Asus Zenpad (Z300M) 10.1", Android 7.0 (backup tablet).
Samsung A3 (SM-A320FG), Android 7.0 (emergency).
Reply
#23
Quote:
(01-21-2016, 02:26 AM)sciurius Wrote: First, I'm currently finishing a tool that reads iRealPro data and formats this into a nice PDF.

You mean a PDF containing a table of contents of all the songs?  My PDFexploder does that, using LaTeX actually:

I get the feeling that you do not know what iRealPro is.

Quote:This was quite cool, but I realised that a huge PDF is unwieldy, and it's much nicer to have one song per PDF, because this works more smoothly regardless of what music reader you choose to use.  Even in MSP it makes it easier to build set lists by cherry-picking songs.

MSPro works with songs, and it does not matter whether a song corresponds to a single-song PDF or a page selection from a huge PDF.


Attached Files
.pdf   x.pdf (Size: 361.31 KB / Downloads: 2)
Johan
Reply
#24
(01-21-2016, 04:44 AM)sciurius Wrote:
(01-21-2016, 03:43 AM)aspiers Wrote: Yes; that's what my code does.  Do you envisage any problems with that?

Mostly that it is a totally unnecessary complication.

I would suggest it's necessary in order to provide a more convenient experience to the user (i.e. the person building the CSV indices), because it optimizes the most common case which is that rows in the index correspond to the order in the fakebook, and that songs appear on contiguous pages within the fakebook.

(01-21-2016, 04:44 AM)sciurius Wrote:
Quote:Yes, but I would argue that the usual interpretation is less useful in this scenario than the meaning I am proposing.  Otherwise you manually have to go through every single song and provide its last page.
...
Another approach might be to provide a simple Ruby / Python script which automatically calculates the last pages via the same algorithm and then outputs an updated version of the CSV file with those numbers included. ... But this two-phase approach is less convenient for the end user and I don't see any real advantage to it. Am I missing something?

A rule of thumb is that when you generate data once, and process it often, it is better to put the overhead in the generating phase.

For performance optimizations on huge data sets, I entirely agree. However, such optimizations are entirely unnecessary here, and they come at the cost of a smoother UX.
Reply
#25
(01-21-2016, 04:49 AM)sciurius Wrote:
Quote:
(01-21-2016, 02:26 AM)sciurius Wrote: First, I'm currently finishing a tool that reads iRealPro data and formats this into a nice PDF.

You mean a PDF containing a table of contents of all the songs?  My PDFexploder does that, using LaTeX actually:

I get the feeling that you do not know what iRealPro is.

I'm not sure why you get that feeling; I've been using iRealPro heavily since long before it got renamed from iRealB.  Perhaps it was because my last statement was a bit misleading - the PDFexploder doesn't take iRealPro data as input, but it does generate a ToC PDF, which is what I got the impression (maybe incorrectly) that your tool also does, once it's parsed the iRealPro data.

(01-21-2016, 04:49 AM)sciurius Wrote:
Quote:This was quite cool, but I realised that a huge PDF is unwieldy, and it's much nicer to have one song per PDF, because this works more smoothly regardless of what music reader you choose to use.  Even in MSP it makes it easier to build set lists by cherry-picking songs.

MSPro works with songs, and it does not matter whether a song corresponds to a single-song PDF or a page selection from a huge PDF.

"It does not matter" is true in the sense that MSPro supports page selections from a huge PDF.  But that kind of misses my point about the UX.  It's easier to import when there is no page selection to configure.  There are other advantages to exploding a big PDF prior to import too, e.g. the potential to save space on your device by only importing the songs you really need, and it makes it easier to open songs using other PDF readers too (which is particularly important to me right now due to http://zubersoft.com/mobilesheets/forum/...p?tid=3224 ...)  A third one is that an autogenerated ToC PDF can have hyperlinks to the PDF for each song, and this will work reliably without requiring that your PDF reader supports deeplinking from one PDF to a fixed page inside another PDF.

Having said that, the pros and cons of each approach here are marginal, so I really don't think it's worth debating them in too much depth (and I don't have the time to continue with it anyway).  Both approaches are perfectly valid, and both will suit users with differing use cases, so both are worth having as options.  Enough said :-)
Reply
#26
Quote:Both approaches are perfectly valid, and both will suit users with differing use cases

Indeed.
Johan
Reply
#27
I have a number of index files in Excel XLS format that I would like to share as soon as the format is clear and I find the time to make the necessary adaptations.
Back in the good old pen/paper/binder time I came across the first fakebooks as PDFs. I kept (and still keep) them on the PC and printed single songs for my ring binders. It soon was clear that finding songs is essential. So I started building a database. Some TOCs I could find on the internet; others are scanned and OCR'ed. Proofreading, completing and correcting was and still is a time-consuming task, a never-ending story.
I keep an XLS file per fakebook for later reference and have imported them into an Access database. With MSP the PDFs became much more usable. So I was motivated to invest time in completing the indexes.
I was involved in lengthy discussions in this forum about keeping big fakebook PDFs or splitting them. To keep my library clear I copy useful fakebooks to the tablet (one big PDF per book) but import only those songs that I really want to play. For every book I import the fakebook's TOC pages as one song, and the whole book as a second "song". And I export from my database the indexes of only those fakebooks that are on the tablet as one big XLS song index. This way I have some hundreds of songs in a well-maintained library and several thousand at hand that I can use within minutes. I can jump to the respective page in the fakebook via MSP's "go to page" or add a song to the library by copying an existing song in MSP and copying/pasting the metadata from the XLS. That works fine so far regarding fakebooks and I will keep that workflow. Songs that are permanently in my repertoire are individual files, one per song, fine-tuned and exported as PDF from Finale, MuseScore, WinWord... or ChordPro. But that's another topic.
first language: German
Acer A1-830, Android 4.4.2 - HP x2 210 G2 Detachable, Win 10 1709
http://www.moonlightcrisis.de - http://www.basdjo.de - http://www.frankenbaend.de


Reply
#28
Back to technical details:
Here is what I keep in my index XLS files and the database (see the Firehouse Jazzband example):
Index - the index as listed in the book's TOC
In the above example it's a song number; in most cases it's a page number. I also came across a book with genre chapters that skips a number of pages after every chapter, so that a later edition can add songs to the correct chapter without changing all the page numbers.
PDFPage - the starting page in the PDF as required for MSP page order or PDFexploder
PDFLastPage - the last page of the song in the PDF, filled only for songs with more than one page
IMHO it makes sense to keep it per song in the XLS so that it is in place when the XLS is sorted alphabetically by title.
PageOrder - what MSP needs to access the songs correctly
Mostly calculated by Excel macros, with individual corrections for e.g. mixed-up pages.
AlternativTitel - allows a second title entry
Useful for e.g. Autumn Leaves = Les Feuilles Mortes, Manha de Carnaval = Black Orpheus = Orfeo Negro and so on.
Titel and Key do not need to be explained.
I usually do not maintain composer, year, genre and more. For me that's not important enough to be worth the effort. Anybody's welcome to add more...
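
As an illustration of the PageOrder calculation, here is a small Python sketch of the default case. It assumes PageOrder is written as "first-last" for multi-page songs and as a single number otherwise; that format, and the function name `page_order`, are assumptions here. The manual corrections for mixed-up pages are not covered.

```python
def page_order(pdf_page, pdf_last_page=""):
    """Build a page-order string from PDFPage and an optional PDFLastPage."""
    first = int(pdf_page)
    last_text = str(pdf_last_page).strip()
    if not last_text:
        # PDFLastPage is only filled for songs with more than one page.
        return str(first)
    last = int(last_text)
    if last < first:
        raise ValueError(f"last page {last} is before first page {first}")
    return f"{first}-{last}" if last > first else str(first)
```

For example, a song with PDFPage 12 and PDFLastPage 14 would get "12-14", while a song with only PDFPage 12 would get "12".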


Reply
#29
(01-21-2016, 02:26 AM)sciurius Wrote: I'm going to carry the use of CSV metadata a step further.

First, I'm currently finishing a tool that reads iRealPro data and formats this into a nice PDF. iRealPro songs contain a limited amount of metadata like title, composer, style, key, and tempo.
If the iRealPro data contains an iRealPro playlist, I produce a multi-page PDF document and the corresponding metadata CSV. In other words: a Fakebook plus index in one go.

For reasons not relevant here, my tool can also produce PNGs instead of a PDF. This brings me to the feature request to extend the use of metadata CSV for other imports (in particular, batch import) as well.

For example, I have a folder with ChordPros or PNGs, each containing one song. I can batch import this folder, but it would be very nice if I could place a metadata CSV in the folder (or specify on the import dialog) so that all imported songs can have some metadata filled in.
I even think that given how far Mike already implemented support for metadata CSVs this won't be hard to add.
Implementation hint: Add a "filename" or "pathname" element to the CSV to match a file with its metadata.

That's a very interesting idea that I'll have to look into adding at some point.  I currently look for a PDF that matches the CSV name, and if that isn't found, I expect the first line of the CSV to contain a filename. I'm sure this doesn't really match the CSV spec, but I added it just in case it would be useful.  It seems odd to me to have a column for filename for the current CSV import mechanism, when that column doesn't apply to any of the song metadata (and then you have to set something for that column for every song).  I suppose if I supported populating metadata for new songs created through other import mechanisms, the filename could be specified as a column, but that would definitely require different parsing.
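
The lookup order described above (a PDF with the same name as the CSV first, then a filename on the first line of the CSV) could be sketched in Python like this. It is only an illustration of the described behaviour, not the MSP source; the function name `resolve_pdf` and the assumption that the named file sits in the same folder as the CSV are made up for the example.

```python
import csv
from pathlib import Path


def resolve_pdf(csv_path):
    """Return the PDF that a metadata CSV refers to, or None if not found."""
    csv_path = Path(csv_path)
    # 1. Prefer a PDF with the same base name as the CSV (book.csv -> book.pdf).
    candidate = csv_path.with_suffix(".pdf")
    if candidate.is_file():
        return candidate
    # 2. Otherwise treat the first field of the first CSV line as a filename.
    with csv_path.open(newline="", encoding="utf-8") as handle:
        first_row = next(csv.reader(handle), [])
    if first_row and first_row[0].strip():
        named = csv_path.parent / first_row[0].strip()
        if named.is_file():
            return named
    return None
```

A per-row "filename" column for batch import would replace step 2 with a lookup per song rather than per CSV file.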
Reply
#30
Yes, the proposal to add a "filename" column applies to bulk importing multiple files from a directory. It is not necessary for the current PDF/CSV import.
Johan
Reply





