Semantic Audio Studio Tools and Techniques



Date: 30.10.2018


  • Semantic Audio

  • Studio Tools and Techniques

  • using MPEG-7

  • Dr. Michael Casey

  • Centre for Computational Creativity

  • Department of Computing

  • City University, London


Overview

  • MPEG-7 Tools

      • Low Level Audio Descriptors
      • Statistical Sound Models (Semantic?)
  • Music Unmixing

  • Sound Classification

      • Automatic label extraction
      • “Semantic” processing
  • Segment Similarity, Structure Extraction, Musaics

      • S-Matrix (Self-Similarity Matrix)
      • C-Matrix (Cross-Similarity Matrix)
      • Segment Replacement
      • Musaics


Semantic Audio Analysis



MPEG-7 Audio Descriptors



Some Useful Descriptors for Music Processing



EXAMPLE 1 MUSIC UNMIXING



AudioSpectrumBasisD



AudioSpectrumProjectionD



Music Unmixing

  • Linear basis projection using SVD and ICA

      • spectrum subspace separation
      • fast computation of subspace ICA
      • full-rate filterbank masking
  • Blocked ICA functions

      • subspace reconstruction Y = X V V^T
      • cluster subspaces to identify “tracks”
      • sum masked filterbank output to create audio
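The subspace reconstruction step listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the talk: X is taken to be a magnitude spectrogram with frames as rows and frequency bins as columns, V holds the top-k right singular vectors from the SVD, and the ICA rotation, subspace clustering and filterbank masking steps are left out. The function name `subspace_reconstruct` and the synthetic data are hypothetical. Loosely, V plays the role of the basis stored in AudioSpectrumBasisD and P the role of the projection stored in AudioSpectrumProjectionD.

```python
import numpy as np

def subspace_reconstruct(X, k):
    """Reconstruct X (frames x bins) from its top-k spectral basis functions."""
    # Thin SVD: X = U @ diag(s) @ Vt; the rows of Vt are spectral basis vectors.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:k].T   # bins x k basis (an ICA step would rotate these columns)
    P = X @ V      # frames x k projection onto the basis
    Y = P @ V.T    # subspace reconstruction, Y = X V V^T
    return Y, V, P

# Synthetic rank-2 "spectrogram" (assumption for illustration only):
# two fixed spectral shapes with time-varying gains.
rng = np.random.default_rng(0)
shapes = rng.random((2, 16))   # 2 sources x 16 frequency bins
gains = rng.random((50, 2))    # 50 frames x 2 sources
X = gains @ shapes

Y, V, P = subspace_reconstruct(X, k=2)
print("max reconstruction error:", float(np.abs(X - Y).max()))
```

Because the synthetic X has rank 2, two basis functions are enough to reconstruct it up to floating-point error. On real audio, k trades reconstruction detail against the number of subspaces that later need to be clustered into "tracks".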




Music Unmixing Example (Pink Floyd: mono -> 9 subspace tracks)



EXAMPLE 2 AUTOMATIC AUDIO CLASSIFICATION







MPEG-7: Intelligent Music Browsing



Music Genre Classification:



Semantic Audio: General Sound Taxonomy



DS: General Audio Classification



EXAMPLE 3 STRUCTURE EXTRACTION



Structure Discovery



SoundModelStatePathD



SoundModelStateHistogramD





S-Matrix
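As a rough illustration of how a self-similarity matrix could be built from the two descriptors named above, the sketch below turns a state path into per-segment state histograms and compares every pair of segments. Several things here are assumptions rather than content from the slides: the fixed-length segmentation, the choice of cosine similarity as the measure, the toy state path, and all function names.

```python
import math

def state_histogram(path, n_states):
    """Normalized occupancy histogram of a state-index sequence."""
    counts = [0] * n_states
    for state in path:
        counts[state] += 1
    return [c / len(path) for c in counts]

def windowed_histograms(path, n_states, win):
    """One histogram per non-overlapping window of `win` frames (assumed segmentation)."""
    return [state_histogram(path[i:i + win], n_states)
            for i in range(0, len(path) - win + 1, win)]

def cosine(a, b):
    """Cosine similarity between two histograms (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def s_matrix(hists):
    """Self-similarity matrix: S[i][j] compares segment i with segment j."""
    return [[cosine(a, b) for b in hists] for a in hists]

# Toy state path with an A B A B layout (hypothetical data).
path = [0, 0, 1, 0,  2, 2, 3, 2,  0, 1, 0, 0,  2, 3, 2, 2]
hists = windowed_histograms(path, n_states=4, win=4)
S = s_matrix(hists)
for row in S:
    print(["%.2f" % v for v in row])
```

In this toy case segments 0 and 2 occupy the same states in the same proportions, so their similarity is high, while segments 0 and 1 share no states and score zero. Repeated sections show up as off-diagonal blocks of high similarity.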



STRUCTURE EXTRACTION == SEGMENTATION



EXAMPLE 4 MUSAICS



Musaics (Music Mosaics)

  • C-Matrix : Cross-Song Similarity Matrix

      • Outer product of target and source histograms
  • Find segments similar to target segment

      • Similarity between all target and database segments
      • SORT columns of similarity matrix
  • Replace segments with similar material

      • Segmentation boundaries (beat alignment)
      • Replace with “best fit” using DTW on most similar segments
  • EXAMPLES
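The C-Matrix and sorting steps above can be sketched as below, reading "outer product of target and source histograms" as the matrix of dot products between every target-segment histogram and every source-segment histogram. That reading, the function names and the toy histograms are assumptions for illustration only.

```python
def c_matrix(target_hists, source_hists):
    """Cross-similarity matrix: C[i][j] = dot(target_hists[i], source_hists[j])."""
    return [[sum(t * s for t, s in zip(th, sh)) for sh in source_hists]
            for th in target_hists]

def rank_sources(C):
    """For each target segment, source indices ordered from most to least similar."""
    return [sorted(range(len(row)), key=lambda j: -row[j]) for row in C]

# Hypothetical state histograms: 2 target segments, 3 database segments.
target_hists = [[0.75, 0.25, 0.0, 0.0],
                [0.0, 0.0, 0.75, 0.25]]
source_hists = [[0.0, 0.0, 1.0, 0.0],
                [1.0, 0.0, 0.0, 0.0],
                [0.5, 0.5, 0.0, 0.0]]

C = c_matrix(target_hists, source_hists)
ranking = rank_sources(C)
print(C)
print(ranking)
```

A plain dot product favours sharply peaked histograms, so a normalized measure may be preferable in practice. The ranking step is the same either way.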



Musaics

  • New Content by Similarity Replacement

      • C-Matrix: Cross-Song Similarity Map
      • 1 Target, Many Sources
  • Constraints

      • Preserve Rhythm by Beat Tracking
      • Preserve Beats by DTW alignment
  • Bigger Source Database == Better
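The "best fit" replacement step can be illustrated with a textbook dynamic time warping (DTW) cost. This is a sketch under assumptions, not the system described in the talk: each segment is represented by a one-dimensional feature sequence (for example a per-beat feature value), the local cost is the absolute difference, and the names `dtw_cost` and `best_fit` are hypothetical.

```python
def dtw_cost(a, b):
    """Classic DTW alignment cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = cheapest cost of aligning a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            D[i][j] = local + min(D[i - 1][j],      # stretch b
                                  D[i][j - 1],      # stretch a
                                  D[i - 1][j - 1])  # advance both
    return D[n][m]

def best_fit(target, candidates):
    """Index of the candidate with the lowest DTW cost, plus all costs."""
    costs = [dtw_cost(target, c) for c in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, costs

# Hypothetical feature sequences for one target segment and three
# pre-selected (most similar) database segments.
target = [0, 1, 2, 1, 0]
candidates = [[0, 0, 1, 2, 1, 0],   # same contour, first frame held longer
              [2, 2, 2, 2],         # flat contour
              [0, 1, 1, 0]]         # similar contour, peak missing

best, costs = best_fit(target, candidates)
print(best, costs)
```

Running DTW only on the few most similar segments keeps the quadratic alignment cost manageable as the source database grows.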



Acknowledgements

  • International Organization for Standardization

      • ISO/IEC JTC 1 SC29 WG11 (MPEG)
  • Mitsubishi Electric Research Labs

  • Massachusetts Institute of Technology

      • Music Mind Machine Group (formerly Machine Listening Group)
      • Paris Smaragdis, Youngmoo Kim, Brian Whitman
      • Iroro Orife, John Hershey, Alex Westner, Kevin Wilson
  • City University

      • Department of Computing
      • Centre for Computational Creativity


